Preface



This volume contains the proceedings of the 5th International Conference on 
Verification, Model Checking, and Abstract Interpretation (VMCAI 2004), held 
in Venice, January 11-13, 2004, in conjunction with POPL 2004, the 31st Annual 
Symposium on Principles of Programming Languages, January 14-16, 2004. The 
purpose of VMCAI is to provide a forum for researchers from three communities — 
verification, model checking, and abstract interpretation — which will facilitate 
interaction, cross-fertilization, and the advance of hybrid methods that combine 
the three areas. With the growing need for formal tools to reason about complex, 
infinite-state, and embedded systems, such hybrid methods are bound to be of 
great importance. 

Topics covered by VMCAI include program verification, static analysis tech- 
niques, model checking, program certification, type systems, abstract domains, 
debugging techniques, compiler optimization, embedded systems, and formal 
analysis of security protocols. 

This year’s meeting follows the four previous events in Port Jefferson (1997),
Pisa (1998), Venice (2002, LNCS 2294), and New York (2003, LNCS 2575). In
particular, we thank VMCAI 2003’s sponsor, the Courant Institute at New York 
University, for allowing us to apply a monetary surplus from the 2003 meeting 
to this one. 

The program committee selected 22 papers out of 68 on the basis of three re- 
views. The principal criteria were relevance and quality. The program of VMCAI 
2004 included, in addition to the research papers, 

— a keynote speech by David Harel (Weizmann Institute, Israel) on A Grand 
Challenge for Computing: Full Reactive Modeling of a Multicellular Animal, 

— an invited talk by Dawson Engler (Stanford University, USA) on Static Anal- 
ysis Versus Software Model Checking for Bug Finding, 

— an invited talk by Mooly Sagiv (Tel Aviv University, Israel) called On the 
Expressive Power of Canonical Abstraction, and 

— a tutorial by Joshua D. Guttman (Mitre, USA) on Security, Protocols, and 
Trust. 

We would like to thank the Program Committee members and the reviewers, 
without whose dedicated effort the conference would not have been possible. 
Our thanks go also to the Steering Committee members for helpful advice, to 
Agostino Cortesi, the Local Arrangements Chair, who also handled the
conference’s Web site, and to David Schmidt, whose expertise and support were
invaluable for the budgeting. Special thanks are due to Martin Karusseit for
installing, managing, and taking care of the METAFrame Online Conference 
Service, and to Claudia Berbers, who, together with Alfred Hofmann and his 
team at Springer-Verlag, collected the final versions and prepared the proceed- 
ings. 
Special thanks are due to the institution that helped sponsor this event, the
Department of Computer Science of Ca’ Foscari University, and to the profes- 
sional organizations that support the event: VMCAI 2004 is held in cooperation 
with ACM and is sponsored by EAPLS. 

January 2004 Bernhard Steffen 
Security, Protocols, and Trust*



Joshua D. Guttman 
guttman@mitre.org

http://www.ccs.neu.edu/home/guttman



Information security has benefited from mathematically cogent modeling and 
analysis, which can assure the absence of specific kinds of attacks. Information 
security provides the right sorts of problems: Correctness conditions may be 
subtle, but they have definite mathematical content. Systems may be complex, 
but the essential reasons for failures are already present in simple components. 
Thus, rigorous methods lead to clear improvements. 

In this tutorial, we focus on one problem area, namely cryptographic proto- 
cols. Cryptographic protocols are often wrong, and we will start by studying how 
to break them. Most protocol failures arise from unintended services contained 
in the protocols themselves. An unintended service is an aspect of the protocol 
that requires legitimate principals unwittingly to provide an attacker with in- 
formation that helps the attacker defeat the protocol. We describe a systematic 
way to discover unintended services and to piece them together into attacks. 

Turning to the complementary problem of proving that there are no attacks 
on a particular protocol, we use the same insights to develop three basic patterns 
for protocol verification. These patterns concern the way that fresh, randomly 
chosen values ( “nonces” ) are transmitted and later received back in cryptograph- 
ically altered forms. We explain how these patterns, the authentication tests, are 
used to achieve authentication and to guarantee recency. They serve as a design 
method as well as a verification method. 

In themselves, however, these methods do not explain the commitments that 
a principal makes by specific protocol actions, nor the trust one principal must 
have in another in order to be willing to continue a protocol run. In the last part 
of the tutorial, we describe how to combine protocol analysis with a trust man- 
agement logic in order to formalize the trust consequences of executing protocols 
for electronic commerce and access control. 



* Supported by the United States National Security Agency and the
MITRE-Sponsored Research Program.
Security Types Preserving Compilation*

(Extended Abstract) 



Gilles Barthe¹, Amitabh Basu²**, and Tamara Rezk¹

¹ INRIA Sophia-Antipolis, France {Gilles.Barthe,Tamara.Rezk}@sophia.inria.fr
² IIT Delhi, India csu00099@cse.iitd.ernet.in



Abstract. Initiating from the seminal work of Volpano and Smith, there 
has been ample evidence that type systems may be used to enforce con- 
fidentiality of programs through non-interference. However, most type 
systems operate on high-level languages and calculi, and “low-level lan- 
guages have not received much attention in studies of secure informa- 
tion flow” (Sabelfeld and Myers, [16]). Further, security type systems for 
low-level languages should appropriately relate to their counterparts for 
high-level languages; however, we are not aware of any study of type- 
preserving compilers for type systems for information flow. 

In answer to these questions, we introduce a security type system for 
a low-level language featuring jumps and calls, and show that the type 
system enforces termination-insensitive non-interference. Then, we intro- 
duce a compiler from a high-level imperative programming language to 
our low-level language, and show that the compiler preserves security 
types. 



1 Introduction 

Type systems are popular artefacts to enforce safety properties in the context of
mobile and embedded code. While such safety properties fall short of providing
appropriate guarantees with respect to security policies to which mobile and
embedded code must adhere, recent work has demonstrated that type systems are
adequate to statically enforce security policies. These works generally focus on
confidentiality and in particular on non-interference [7], which ensures
confidentiality through the absence of information leakage. Initiating from the seminal
work of Volpano, Smith and Irvine [20], type systems for non-interference have
been thoroughly studied in the literature, see e.g. [16] for a survey. However, most
works focus on high-level calculi, including λ-calculus, see e.g. [8], π-calculus, see
e.g. [9], and ς-calculus [3], or high-level programming languages, including Java
[2,12] and ML [15].

In contrast, relatively little is known about non-interference for low-level 
languages, in particular because their lack of structure renders control flow more 
intricate; in fact existing works, see e.g. [4,5], use model-checking and abstract 

* Work partially supported by IST Projects Profundis and Verificard.

** This work was performed while the author was visiting INRIA Sophia-Antipolis. 
interpretation techniques to detect illegal information flows, but do not provide
proofs of non-interference for programs that are accepted by their analysis. Thus 
the first part of this paper is devoted to the definition of a security type system 
for a low-level language with jumps and calls, and a proof that the type system 
enforces termination-insensitive non-interference. 

Of course, security type systems for low-level languages should appropriately 
relate to their counterparts for high-level languages. Indeed, one would expect 
that compilation preserves security typing. Thus the second part of the paper is 
devoted to a case study in compilation with security types: we define a high-level 
imperative language with procedures, and a compiler to the low-level language 
studied in the first part of the paper. Further, we endow the language with a
type system that guarantees termination-insensitive non-interference, and show
that the compilation function preserves typing. The proof that compilation preserves
typing proceeds by induction on the structure of derivations, and can be viewed
as a procedure to compute, from a certificate of well-typing of the source
program, another certificate of well-typing for the compiled program. It is thus very
close in spirit to a certifying compiler [13].



Contents. The remainder of the paper is organized as follows. In Section 2
we define an assembly language that shall serve as the compiler target,
endow it with a security type system, and prove that the type system enforces
termination-insensitive non-interference. In Section 3, we introduce a high-level 
imperative language with procedures and its associated type system. Further, 
we introduce a compiler that we show to preserve security typing; we also dis- 
cuss how type-preserving compilation can be used to lift non-interference to the 
high-level language. We conclude in Section 4, with related work and directions 
for further research. 



2 Assembly Language 

2.1 Syntax and Operational Semantics 

The assembly language is a small language with jumps and procedures. A
program $P$ is a set of procedures with a distinguished procedure, main; we let
$P_f$ be the procedure associated to an identifier $f$ in $P$. Each procedure $P_f$
consists of an array of instructions; we let $P_f[i]$ be the $i$-th instruction in
$P_f$. The set Instr of instructions and the set $Prog_c$ of compiled programs are
defined in Figure 1. We often denote programs by $P_c ::= [f := i^{*}]^{*}$. Given a
program $P$, we let $\mathcal{PP}$ be its set of program points, i.e. the set of pairs
$(f, i)$ with $f \in \mathcal{F}$, where $\mathcal{F}$ is a set of procedure names, and
$i \in dom(P_f)$. Further, we assume programs to satisfy the usual well-formedness
conditions, such as code containment: for every program point $(f, i)$,
$P_f[i] = \mathsf{if}\ j \Rightarrow j \in dom(P_f)$, etc.

The operational semantics is given as a transition relation between states.
In our setting, values are integers, i.e. $\mathcal{V} = \mathbb{Z}$, and states are triples of the form
$\langle cs, \rho, s \rangle$ where $cs \in \mathcal{PP}^{*}$ is a call string whose length is bounded by some
    instr ::= prim op     primitive value/operation
            | load x      load value of x on stack
            | store x     store top of stack in x
            | if j        conditional jump
            | goto j      unconditional jump
            | call f      procedure call
            | return      return

where op is either a literal $n \in \mathbb{Z}$, or a primitive operation $+, \times, \ldots$, or a
comparison operation $<$, $f$ ranges over a set $\mathcal{F}$ of procedure names, $x$ ranges
over a set $\mathcal{X}$ of registers, and $j$ ranges over $\mathbb{N}$.

Fig. 1. Instruction set



previously agreed upon value max, $\rho : \mathcal{X} \to \mathcal{V}$ is a register map that assigns
values to local variables (note that $\mathcal{X}$ is finite for any fixed program), and $s$ is an
operand stack, i.e. a stack of values. The operational semantics of the assembly
language is given by the rules of Figure 2; all rules are subject to the proviso
that the sizes of the call string and the operand stack remain bounded by some
previously agreed upon maximal sizes max and MAX; further we assume given for
every operation symbol op a corresponding total binary function on integers.
Finally, we write $\rho \oplus \{x \mapsto v\}$ to denote the unique function $\rho'$ s.t. $\rho'(y) = \rho(y)$
if $y \neq x$ and $\rho'(x) = v$.

Observe that procedure calls do not activate a new frame with its own local 
variables and operand stacks, as e.g. in the JVM; in fact, procedures are closer 
to JVM subroutines than they are to JVM method invocations. 
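
The rules of Figure 2 are not reproduced in this text, so the following Python interpreter is only a minimal sketch of how the semantics described above might look, not the paper's definition. It assumes that `if j` pops the stack and jumps when the popped value is non-zero (consistent with Example 2 below), that a binary `prim op` pops two operands, and that `return` resumes at the point after the calling instruction, as in the successor relation of Section 2.4. The tuple encoding of instructions, the `OPS` table, and the `fuel` parameter are illustrative assumptions.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical encoding: ("prim", 3), ("prim", "+"), ("load", "x"), ("store", "x"),
# ("if", 6), ("goto", 8), ("call", "f"), ("return",). Program points are 1-indexed.
Program = Dict[str, List[Tuple]]

# Assumed primitive operations; the text only requires total binary functions.
OPS: Dict[str, Callable[[int, int], int]] = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    "<": lambda a, b: int(a < b),
}

def run(prog: Program, rho: Dict[str, int],
        max_cs: int = 16, max_stack: int = 16, fuel: int = 10_000) -> Dict[str, int]:
    """Execute from (main, 1) and return the final register map."""
    rho = dict(rho)                 # registers are shared by all procedures
    stack: List[int] = []           # one operand stack, also shared
    cs = [("main", 1)]              # call string, head first
    while cs:
        fuel -= 1
        if fuel < 0:
            raise RuntimeError("out of fuel (possible non-termination)")
        (f, i), rest = cs[0], cs[1:]
        ins = prog[f][i - 1]
        tag = ins[0]
        nxt = [(f, i + 1)] + rest   # default: fall through
        if tag == "prim":
            if isinstance(ins[1], int):
                stack.append(ins[1])
            else:
                b = stack.pop()
                a = stack.pop()
                stack.append(OPS[ins[1]](a, b))
        elif tag == "load":
            stack.append(rho[ins[1]])
        elif tag == "store":
            rho[ins[1]] = stack.pop()
        elif tag == "if":
            if stack.pop() != 0:    # assumption: jump on non-zero
                nxt = [(f, ins[1])] + rest
        elif tag == "goto":
            nxt = [(f, ins[1])] + rest
        elif tag == "call":
            nxt = [(ins[1], 1)] + cs   # no new frame, only a longer call string
        elif tag == "return":
            if rest:
                (g, j), tail = rest[0], rest[1:]
                nxt = [(g, j + 1)] + tail
            else:
                nxt = []
        else:
            raise ValueError(f"unknown instruction {ins!r}")
        if len(nxt) > max_cs or len(stack) > max_stack:
            raise RuntimeError("call string or operand stack bound exceeded")
        cs = nxt
    return rho

add = {"main": [("prim", 2), ("prim", 3), ("prim", "+"), ("store", "x"), ("return",)]}
print(run(add, {"x": 0}))
```

Because a call only extends the call string, a callee sees and can modify the caller's registers and operand stack, which is the point made in the paragraph above about JVM subroutines.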
2.2 Defining Non-interference

Informally, non-interference guarantees that executing a program $P$ on initial
states that are indistinguishable from the point of view of an attacker will not
result in observable differences for the attacker. There are however a number of
different ways in which this definition can be made precise, depending on the
formulation of observability. One notion that seems well adapted to our context
is termination-insensitive non-interference, which says that given any two states
$i$ and $i'$, and assuming that executing $P$ on $i$ and $i'$ respectively yields the final
states $f$ and $f'$, indistinguishability between $i$ and $i'$ entails indistinguishability
between $f$ and $f'$. Note that such a definition implicitly assumes that an
attacker cannot observe termination; there are stronger notions of confidentiality
that consider termination and even execution time as observable, see e.g. [16].
Indistinguishability is a relation between states and is defined w.r.t. security maps
that assign security levels to registers and to program points; throughout this
paper, we assume that the set of security levels is $\mathcal{S} = \{H, L\}$ and is ordered by
the clause $L \le H$; considering a lattice of security levels instead is possible but
adds technicalities without adding insight. Indistinguishability on states is
defined in terms of indistinguishability relations on register maps, and on operand
stacks. The former is a pointwise extension of the obvious indistinguishability
relation on values.

Definition 1 (Values and register maps indistinguishability).

- Value indistinguishability $v \sim_{SL} v'$ (of values $v$ and $v'$ w.r.t. security level
  $SL$) is defined as $SL = H \lor v = v'$.

- The relation is extended pointwise to maps: for $\rho, \rho' : \mathcal{X} \to \mathcal{V}$ and
  $\Gamma : \mathcal{X} \to \mathcal{S}$, $\rho \sim_{\Gamma} \rho'$ is defined as

  $$\forall x \in \mathcal{X}.\ (\rho\, x) \sim_{(\Gamma\, x)} (\rho'\, x)$$

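At this point, it is already possible to define non-interference for programs.

Definition 2. A program $P$ whose main procedure is main is non-interferent
w.r.t. $\Gamma : \mathcal{X} \to \mathcal{S}$, written $NI_{\Gamma}(P)$, if for every $\rho, \rho', \mu, \mu' : \mathcal{X} \to \mathcal{V}$ such that
$\rho \sim_{\Gamma} \rho'$ and $\langle (main, 1), \rho, \epsilon \rangle \leadsto^{*} \langle \epsilon, \mu, \epsilon \rangle$ and
$\langle (main, 1), \rho', \epsilon \rangle \leadsto^{*} \langle \epsilon, \mu', \epsilon \rangle$, we have $\mu \sim_{\Gamma} \mu'$.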

Stack indistinguishability requires a slight generalization of the pointwise
order on stacks so as to handle high if instructions with branches of different
length (a motivation for this is given in Section 2.4). The intuition is that we require operand
stacks to be indistinguishable pointwise on some common top part, and then
to be high in the bottom part on which they may not coincide. High operand
stacks are defined relative to a stack type: formally, let $s$ be an operand stack
and $st$ be a stack type; we write $high_{os}(s, st)$ if $s$ and $st$ have the same length
$n$ and $st[i] = H$ for every $1 \le i \le n$.

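Definition 3 (Operand stack indistinguishability). Let $s, s'$ be operand
stacks and $st, st' \in ST$. Then $s \sim_{st,st'} s'$ is defined inductively as follows:

$$\frac{high_{os}(s, st) \qquad high_{os}(s', st')}{s \sim_{st,st'} s'}
\qquad\qquad
\frac{s \sim_{st,st'} s' \qquad v \sim_{k} v'}{v :: s \ \sim_{k::st,\ k::st'}\ v' :: s'}$$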
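
The two rules of Definition 3 translate directly into a recursive check. The sketch below is one possible reading, assuming stacks and stack types are Python lists with the top element at index 0; the function names are illustrative.

```python
H, L = "H", "L"

def high_os(s, st):
    """s and st have the same length and every entry of st is H."""
    return len(s) == len(st) and all(k == H for k in st)

def stack_indist(s1, st1, s2, st2):
    """s1 ~_{st1,st2} s2, following the two rules of Definition 3."""
    # Base rule: both stacks are high w.r.t. their own stack types
    # (they may then differ in length and content).
    if high_os(s1, st1) and high_os(s2, st2):
        return True
    # Inductive rule: equal levels on top, indistinguishable top values,
    # and indistinguishable remainders.
    if s1 and s2 and st1 and st2 and st1[0] == st2[0]:
        k = st1[0]
        if k == H or s1[0] == s2[0]:
            return stack_indist(s1[1:], st1[1:], s2[1:], st2[1:])
    return False

# Common low top, then high bottoms of different length.
print(stack_indist([1, 9], [L, H], [1, 8, 7], [L, H, H]))
```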
Definition 4 (State indistinguishability). State indistinguishability
$\sigma \sim_{\Gamma,st,st'} \sigma'$ (of states $\sigma = \langle cs, \rho, s \rangle$ and $\sigma' = \langle cs', \rho', s' \rangle$ w.r.t. register type $\Gamma$
and stack types $st$ and $st'$) is defined as $s \sim_{st,st'} s' \land \rho \sim_{\Gamma} \rho'$.



2.3 Control Dependence Regions 

Type systems such as [18,20] reject programs that yield implicit flows through
low assignments in a high branching instruction, e.g. if $y_H$ then $x_L := 0$ else
$x_L := 1$; in this program, the final value of $x_L$ depends on $y_H$ and thus the
program is insecure.

In order to proceed likewise for our assembly language, we must resort to 
control dependence regions, which identify for every if instruction program points 
that execute under its control condition. While such control dependence regions 
are easy to identify in a structured, source language, their computation is slightly 
more intricate for unstructured assembly languages, but see e.g. [1] for algorithms 
that compute such regions. 

For the purpose of this paper, we need not be precise about the exact
computation of such control dependence regions. Instead, we define for every program $P$

$$\mathcal{PP}_{if} = \{ (f, i) \in \mathcal{PP} \mid P_f[i] = \mathsf{if}\ j \}$$

and assume given two functions $reg : \mathcal{PP}_{if} \to \mathcal{P}(\mathcal{PP})$, which computes the control
dependence region of an if, and $jun : \mathcal{PP}_{if} \to \mathcal{PP}$, which computes the junction
point of the two branches of the if. Formally, we assume that

for every $(f', i') \in \mathcal{PP}_{if}$, if $(f', i') \in reg((f, i))$ then $reg((f', i')) \subseteq reg((f, i))$
(we later refer to this property as RIP, or region inclusion property) and that
for every execution path

$$\langle (f, i) :: cs, \rho, s \rangle \leadsto \langle cs_1, \rho_1, s_1 \rangle \leadsto \ldots \leadsto \langle cs_n, \rho_n, s_n \rangle$$

one of the following holds:

- $cs_n = (f_s, i_s) :: \ldots :: (f_1, i_1) :: cs$ with $(f_k, i_k) \in reg((f, i))$ for $1 \le k \le s$;

- there exists $1 \le l \le n$ such that $cs_l = jun((f, i)) :: cs$;

- there exists $1 \le l \le n$ such that $cs_l = (f, j) :: cs$ and $P_f[j] = \mathsf{return}$.

We shall use the function reg in the type system, and the assumption about 

execution paths in the proof of non-interference. 



Remark. Strictly speaking, the function jun need not be total. However, we can
always extend jun to a total function with the required property by supposing
that the last instruction of the main procedure is a return.
2.4 Type System

The type system is defined through an abstract transition system that
manipulates stack types, and a security environment that associates to each program
point a security level, and is parameterized by a register type $\Gamma : \mathcal{X} \to \mathcal{S}$ that
sets the security level of each register. Before giving its formal definition, we
motivate the type system on representative examples.

Example 1. Consider the following piece of program, where $x_L$ is a low variable
and $y_H$ is a high variable:

    load y_H
    store x_L

The first instruction pushes the value held in $y_H$ on top of the operand stack,
while the second instruction stores the top of the operand stack in $x_L$. Thus this
piece of program stores in the variable $x_L$ the value held in the variable $y_H$, and
yields a direct information leakage. Our type system prevents such explicit flows
by restricting the transition of an instruction store $x_L$ to the case where the type
on top of the stack type is low, whereas the instruction load $y_H$ pushes an $H$
onto the stack type.

The security environment $se : \mathcal{PP} \to \mathcal{S}$ detects illicit flows by recording for each
program point pp the highest security level of the control dependence influence
under which the program point pp is.

Example 2. Consider the following pieces of program, where $x_L$ is a low variable
and $y_H$ is a high variable:

    1 load y_H        1 prim 0
    2 if 6            2 store x_L
    3 prim 0          3 load y_H
    4 store x_L       4 if 6
    5 goto 8          5 return
    6 prim 1          6 prim 1
    7 store x_L       7 store x_L
    8 return          8 return

These pieces of program yield implicit flows, as a test on a high value yields
different values for a low variable. Indeed, at program point 8, the low variable
$x_L$ contains the value 0 if $y_H$ contains the value 0, or the value 1 if $y_H$ contains
a value different from 0. Our type system prevents such information leakage by
imposing that the abstract transition rule for a high if instruction sets to $H$
the security level of all program points in its control dependence region, and by
rejecting low assignments and returns that are performed at high program points
(unless the return is the last instruction of the procedure being executed).

The abstract transition system is defined by the typing transfer rules in
Figure 3. These rules, which capture the transformation of security type
information by instructions, are of the form

$$\frac{cs = (f, i) :: cs_0 \qquad P_f[i] = \mathit{instruction}}{\Gamma, cs \vdash st, se \Rightarrow st', se'}$$

where $\Gamma$ is a fixed register type, $st, se$ determine typing constraints for
$cs$, and $st', se'$ determine typing constraints for the successors of $cs$. Note that the
transfer rules define a partial function, i.e. $\Gamma, cs \vdash st, se \Rightarrow st_1, se_1$ and
$\Gamma, cs \vdash st, se \Rightarrow st_2, se_2$ implies $st_1 = st_2$ and $se_1 = se_2$.

The successor relation $\mapsto\ \subseteq CS \times CS$, where $CS$ is the set of $\mathcal{PP}$-lists of length
$< max$, is defined by the clauses:

- if $P_f[i] = \mathsf{return}$ then $(f, i) :: (f', j) :: cs' \mapsto (f', j + 1) :: cs'$ and $(f, i) :: \epsilon \mapsto \epsilon$;

- if $P_f[i] = \mathsf{call}\ f'$ then $(f, i) :: cs \mapsto (f', 1) :: (f, i) :: cs$;

- if $P_f[i] = \mathsf{goto}\ j$ then $(f, i) :: cs \mapsto (f, j) :: cs$;

- if $P_f[i] = \mathsf{if}\ j$ then $(f, i) :: cs \mapsto (f, k) :: cs$ for $k \in \{i + 1, j\}$;

- otherwise, $(f, i) :: cs \mapsto (f, i + 1) :: cs$.
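
The five clauses above can be sketched as a Python function on call strings. This is an illustration under assumptions: call strings are tuples of `(procedure, index)` pairs with the head first, instructions are tuples such as `("if", 6)`, and `max_len` is a hypothetical stand-in for the bound max. The helper `reachable` enumerates the call strings reachable from `(main, 1)`, which is the set over which typing is later required.

```python
def successors(prog, cs, max_len=16):
    """All cs' with cs |-> cs', following the clauses of the successor relation."""
    if not cs:
        return []                        # the empty call string has no successor
    (f, i), rest = cs[0], cs[1:]
    ins = prog[f][i - 1]                 # program points are 1-indexed
    tag = ins[0]
    if tag == "return":
        if not rest:
            out = [()]
        else:
            (g, j), tail = rest[0], rest[1:]
            out = [((g, j + 1),) + tail]
    elif tag == "call":
        out = [((ins[1], 1),) + cs]
    elif tag == "goto":
        out = [((f, ins[1]),) + rest]
    elif tag == "if":
        out = [((f, i + 1),) + rest, ((f, ins[1]),) + rest]
    else:
        out = [((f, i + 1),) + rest]
    return [c for c in out if len(c) < max_len]

def reachable(prog, start=(("main", 1),)):
    """Call strings reachable from `start` (start itself included)."""
    seen, todo = {start}, [start]
    while todo:
        for nxt in successors(prog, todo.pop()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

# Left-hand program of Example 2, with x_L and y_H written "x" and "y".
example2 = {"main": [("load", "y"), ("if", 6), ("prim", 0), ("store", "x"),
                     ("goto", 8), ("prim", 1), ("store", "x"), ("return",)]}
print(successors(example2, (("main", 2),)))
```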

Type-checking is performed by a dataflow analysis that explores all
abstract execution paths; following Brisset and Coglio [6], we opt for a
polyvariant analysis. Hence our type system deals with security types of the form
$CS \to \mathcal{P}(ST \times SE)$, where the set $ST$ of stack types is defined as the set of
$\mathcal{S}$-stacks of length smaller than MAX, and the set $SE$ of security environments
is defined as $\mathcal{PP} \to \mathcal{S}$; given a security type $S$ and a call string $cs$ we let $S_{cs}$
denote $S(cs)$. For the purpose of this paper, we work with judgments of the form
$\Gamma, S, cs \vdash P$ where $S$ is a security type; the typing rule is

$$\frac{\forall st, se \in S_{cs}.\ \forall cs' \in CS.\ cs \mapsto cs' \Rightarrow \exists st', se' \in S_{cs'}.\ \Gamma, cs \vdash st, se \Rightarrow st', se'}{\Gamma, S, cs \vdash P}$$

(There are standard algorithms to compute $S$ when it exists, see e.g. [11].)
Finally, we say that a program $P$ has type $S$ w.r.t. $\Gamma$, written $\Gamma, S \vdash P$, if
$\Gamma, S, cs \vdash P$ for all $cs$ s.t. $(main, 1) :: \epsilon \mapsto^{*} cs$, where $\mapsto^{*}$ denotes the transitive
closure of $\mapsto$. As usual, we say that $P$ is typable w.r.t. $\Gamma$, written $\Gamma \vdash P$, if
$\Gamma, S \vdash P$ for some $S$.

2.5 Soundness 

Typable programs are non-interferent. 

Theorem 1. If $\Gamma \vdash P$ then $NI_{\Gamma}(P)$.

The idea of the proof is as follows: first, we prove in Lemma 1 that indistin- 
guishability is preserved under one step of execution, if the program is typable. 
Second, we prove in Lemma 2 that one step execution in a high-level environ- 
ment yields a result state that is indistinguishable from the original one. By 
combining these results together, we conclude. 

In the sequel, we use $s \cdot cs$ to denote the call string of a state $s$ and hd to
denote the head function. We also write high $st$ if all elements in the $\mathcal{S}$-stack $st$
are high. We also write $\Gamma \vdash cs, st, se \Rightarrow cs', st', se'$ if $\Gamma, cs \vdash st, se \Rightarrow st', se'$ and
$cs \mapsto cs'$, and use $\Gamma \vdash \cdot \Rightarrow^{*} \cdot$ to denote its transitive closure.
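$$\frac{cs = (f, i) :: cs_0 \qquad P_f[i] = \mathsf{load}\ x}{\Gamma, cs \vdash st, se \Rightarrow (\Gamma(x) \sqcup se(f, i)) :: st,\ se}$$

$$\frac{cs = (f, i) :: cs_0 \qquad P_f[i] = \mathsf{store}\ x \qquad k \sqcup se(f, i) \le \Gamma(x)}{\Gamma, cs \vdash k :: st, se \Rightarrow st, se}$$

$$\frac{cs = (f, i) :: cs_0 \qquad P_f[i] = \mathsf{prim}\ n}{\Gamma, cs \vdash st, se \Rightarrow se(f, i) :: st,\ se}$$

$$\frac{cs = (f, i) :: cs_0 \qquad P_f[i] = \mathsf{prim}\ op}{\Gamma, cs \vdash k_1 :: k_2 :: st, se \Rightarrow (k_1 \sqcup k_2 \sqcup se(f, i)) :: st,\ se}$$

$$\frac{cs = (f, i) :: cs_0 \qquad P_f[i] = \mathsf{if}\ j}{\Gamma, cs \vdash k :: st, se \Rightarrow lift_k(st),\ lift_k(se, reg((f, i)))}$$

$$\frac{cs = (f, i) :: cs' \qquad P_f[i] = \mathsf{return} \qquad se(f, i) = L \lor i + 1 \notin dom(P_f)}{\Gamma, cs \vdash st, se \Rightarrow st, se}$$

$$\frac{cs = (f, i) :: cs' \qquad P_f[i] \in \{\mathsf{goto}\ j, \mathsf{call}\ f'\}}{\Gamma, cs \vdash st, se \Rightarrow st, se}$$

where

- $\sqcup$ denotes the lub of two security levels;

- $lift_k(st)$, where $k \in \mathcal{S}$, denotes the pointwise extension to the stack type $st$ of the function
  $\lambda l.\ k \sqcup l$;

- $lift_k(se, R)$, where $k \in \mathcal{S}$, denotes the pointwise extension for all program points in $R$ of the
  function $\lambda l.\ k \sqcup l$.

Fig. 3. Transfer rules for instructions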
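
As an illustration of how the rules of Figure 3 act on a concrete program, here is a Python sketch of the transfer function for a single step. It is not the paper's checker: it handles one `(st, se)` pair at a time rather than a full security type, stack types are tuples with the top at index 0, and the region map `reg` passed in the example is a hand-written assumption (the points between the `if` and its junction point), since the text deliberately leaves the computation of regions open.

```python
H, L = "H", "L"

def lub(a, b):
    return H if H in (a, b) else L

def leq(a, b):
    return a == L or b == H

def transfer(gamma, prog, reg, cs, st, se):
    """Return (st', se') if a transfer rule applies at the head of cs, else None."""
    f, i = cs[0]
    ins = prog[f][i - 1]
    tag = ins[0]
    here = se[(f, i)]                       # security level of the current point
    if tag == "load":
        return (lub(gamma[ins[1]], here),) + st, se
    if tag == "store":
        if st and leq(lub(st[0], here), gamma[ins[1]]):
            return st[1:], se
        return None                         # explicit or implicit flow rejected
    if tag == "prim":
        if isinstance(ins[1], int):
            return (here,) + st, se
        if len(st) >= 2:
            return (lub(lub(st[0], st[1]), here),) + st[2:], se
        return None
    if tag == "if":
        if not st:
            return None
        k = st[0]
        new_st = tuple(lub(k, l) for l in st[1:])     # lift_k(st)
        new_se = dict(se)
        for pp in reg.get((f, i), ()):                # lift_k(se, reg((f, i)))
            new_se[pp] = lub(k, new_se[pp])
        return new_st, new_se
    if tag == "return":
        if here == L or i + 1 > len(prog[f]):         # i + 1 not in dom(P_f)
            return st, se
        return None
    if tag in ("goto", "call"):
        return st, se
    return None

# Left-hand program of Example 2, with x_L and y_H written "x" and "y".
example2 = {"main": [("load", "y"), ("if", 6), ("prim", 0), ("store", "x"),
                     ("goto", 8), ("prim", 1), ("store", "x"), ("return",)]}
gamma = {"x": L, "y": H}
reg = {("main", 2): {("main", k) for k in range(3, 8)}}   # assumed region
se0 = {("main", k): L for k in range(1, 9)}

st, se = transfer(gamma, example2, reg, (("main", 1),), (), se0)   # load y
st, se = transfer(gamma, example2, reg, (("main", 2),), st, se)    # if 6
st, se = transfer(gamma, example2, reg, (("main", 3),), st, se)    # prim 0
print(transfer(gamma, example2, reg, (("main", 4),), st, se))      # store x
```

In this run the high `if` raises points 3 to 7 to `H`, so the literal pushed at point 3 is typed `H` and the `store` into the low register at point 4 has no applicable rule.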



Lemma 1 (One-Step Noninterference in Low-Level Environments).

Suppose $\Gamma, S \vdash P$. Let $s_1, s_2, s_1', s_2'$ be states with $s_1 \cdot cs = s_2 \cdot cs$ and let
$(st_1, se), (st_2, se) \in S_{s_1 \cdot cs}$ be security types s.t. $s_1 \sim_{\Gamma,st_1,st_2} s_2$, and $s_1 \leadsto s_1'$,
and $s_2 \leadsto s_2'$.

Then there exist $(st_1', se_1') \in S_{s_1' \cdot cs}$ and $(st_2', se_2') \in S_{s_2' \cdot cs}$ s.t. $s_1' \sim_{\Gamma,st_1',st_2'} s_2'$.

Furthermore, one of the following holds:

- $se_1' = se_2' = se$ and $s_1' \cdot cs = s_2' \cdot cs$;

- $s_1 \cdot cs = (f, i) :: cs'$ and $P_f[i] = \mathsf{if}\ j$ and hd $st_1$ = hd $st_2$ = $H$.

Proof. By a case analysis on the instruction that is executed. 

Lemma 2 (One-Step Noninterference in High-Level Environments).

Suppose $\Gamma, S \vdash P$. Let $s, s'$ be states and $(st, se) \in S_{s \cdot cs}$ be a security type
s.t. $s \cdot cs = (f, i) :: cs'$, $s \leadsto s'$, high $st$, and $se((f, i)) = H$. Then there exists
$(st', se') \in S_{s' \cdot cs}$ s.t. high $st'$, and $s \sim_{\Gamma,st,st'} s'$, and $\Gamma, s \cdot cs \vdash (st, se) \Rightarrow (st', se')$.

Proof. By a case analysis on the instruction that is executed. 

Proof (of Theorem 1). Consider the two execution paths

$$s_0 \leadsto s_1 \leadsto \ldots \leadsto s_{n_1}$$

$$s_0' \leadsto s_1' \leadsto \ldots \leadsto s'_{n_2}$$

where $s_0 = \langle (main, 1) :: \epsilon, \rho, \epsilon \rangle$, and $s_0' = \langle (main, 1) :: \epsilon, \rho', \epsilon \rangle$, and
$s_{n_1} \cdot cs = s'_{n_2} \cdot cs$.

By invoking Lemma 1 as long as it applies, we conclude for some maximal
$q$ that $s_q \cdot cs = s_q' \cdot cs$, and that there exist $(st_q, se), (st_q', se) \in S_{s_q \cdot cs}$ such
that $s_q \sim_{\Gamma,st_q,st_q'} s_q'$. Now there are two cases to treat: if $s_q \cdot cs = \epsilon$ then
$n_1 = n_2 = q$ and we are done; otherwise, the last instruction executed is a high if.
By the typing rule of the if instruction, it holds that high $st_q$ and high $st_q'$. We
now invoke Lemma 2 repeatedly to conclude that there exist $s_{q_1}$ and $s'_{q_2}$ and
$(st_{q_1}, se_1) \in S_{s_{q_1} \cdot cs}$ and $(st'_{q_2}, se_2) \in S_{s'_{q_2} \cdot cs}$ such that
$s_q \sim_{\Gamma,st_q,st_{q_1}} s_{q_1}$ and $s_q' \sim_{\Gamma,st_q',st'_{q_2}} s'_{q_2}$. By transitivity, we conclude
$s_{q_1} \sim_{\Gamma,st_{q_1},st'_{q_2}} s'_{q_2}$. Further, we
can choose $q_1$ and $q_2$ to be the minimal indexes such that $s_{q_1} \cdot cs = jun((f, i)) :: cs'$
and $s'_{q_2} \cdot cs = jun((f, i)) :: cs'$ respectively. As
$\Gamma \vdash s_q \cdot cs, st_q, se \Rightarrow^{*} s_{q_1} \cdot cs, st_{q_1}, se_1$
and $\Gamma \vdash s_q' \cdot cs, st_q', se \Rightarrow^{*} s'_{q_2} \cdot cs, st'_{q_2}, se_2$, and only if statements may modify
the security environment, and we assume RIP, we can further conclude that
$se_1 = se_2$.

Thus we can apply Lemma 1 again, and repeat the process until reaching the 
final states of the reduction sequences. 



3 Security Type Preserving Compilation 

In this section, we define a high-level imperative language, endow it with a
security type system, and introduce a compiler from the source language to the 
assembly language. Then we show that the compiler preserves security types, 
and derive as a corollary that the security type system for the source language 
enforces non-interference. 



3.1 Source Language 

The source language is a simple imperative language with procedures. A
procedure is a declaration of the form proc f(x) = c; return where $f$ is a procedure
name and $c$ is a command. As with the assembly language, we assume that a
program is a list of procedures with a distinguished procedure, main, without
parameters. Formally, the set Expr of expressions, Comm of commands, and Prog
of programs are given by the following syntaxes:

    e ::= x | n | e op e
    c ::= x := e | f(e) | c; c | while e do c | if e then c else c
    P ::= [proc f(x) = c; return]*

The operational, big-step semantics of programs is based on judgments of the
form $\langle c, \rho \rangle \Downarrow \mu$, where $c \in$ Comm and $\rho, \mu : \mathcal{X} \to \mathcal{V}$. Rules are standard,
see e.g. [18,20], and omitted. The security type system is based on judgments
of the form $\Gamma \vdash_s e : \tau$ and $\Gamma \vdash_s c : \tau$ cmd. A program $P$ is typable, written
$\Gamma \vdash_s P$, if $\Gamma \vdash_s$ main : $\tau$ cmd for some $\tau$. The typing rules are inspired from [18,



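$$(\mathrm{Sub})_e\ \frac{\Gamma \vdash e : \tau \qquad \tau \le \tau'}{\Gamma \vdash e : \tau'}
\qquad
(\mathrm{Sub})_c\ \frac{\Gamma \vdash P : \tau\ \mathsf{cmd} \qquad \tau' \le \tau}{\Gamma \vdash P : \tau'\ \mathsf{cmd}}$$

$$(\mathrm{Val})\ \frac{}{\Gamma \vdash n : \tau}
\qquad
(\mathrm{Var})\ \frac{\Gamma(x) = \tau}{\Gamma \vdash x : \tau}$$

$$(\mathrm{Op})\ \frac{\Gamma \vdash e : \tau \qquad \Gamma \vdash e' : \tau}{\Gamma \vdash e\ op\ e' : \tau}
\qquad
(\mathrm{Assign})\ \frac{\Gamma \vdash e : \tau \qquad \Gamma(x) = \tau}{\Gamma \vdash x := e : \tau\ \mathsf{cmd}}$$

$$(\mathrm{Seq})\ \frac{\Gamma \vdash P : \tau\ \mathsf{cmd} \qquad \Gamma \vdash Q : \tau\ \mathsf{cmd}}{\Gamma \vdash P ; Q : \tau\ \mathsf{cmd}}
\qquad
(\mathrm{While})\ \frac{\Gamma \vdash e : \tau \qquad \Gamma \vdash P : \tau\ \mathsf{cmd}}{\Gamma \vdash \mathsf{while}\ e\ \mathsf{do}\ P : \tau\ \mathsf{cmd}}$$

$$(\mathrm{Cond})\ \frac{\Gamma \vdash e : \tau \qquad \Gamma \vdash P : \tau\ \mathsf{cmd} \qquad \Gamma \vdash Q : \tau\ \mathsf{cmd}}{\Gamma \vdash \mathsf{if}\ e\ \mathsf{then}\ P\ \mathsf{else}\ Q : \tau\ \mathsf{cmd}}$$

$$(\mathrm{App})\ \frac{\Gamma \vdash P : \tau\ \mathsf{cmd} \qquad \Gamma \vdash e : \tau \qquad \Gamma(x) = \tau}{\Gamma \vdash f(e) : \tau\ \mathsf{cmd}}$$

Fig. 4. Typing rules for high-level language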



20], and are given in Figure 4¹; in the last rule, we assume that the procedure
$f$ is defined by proc f(x) = P; return. The typing rules exclude mutual and
self-recursion; however it is possible to overcome this limitation at the price of
further technicalities.
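
The rules of Figure 4 are declarative because of subtyping. One way to check them mechanically, sketched below in Python, is to compute for each expression its least level and for each command the greatest level $\tau$ such that the command has type $\tau$ cmd. This algorithmic reading is an assumption of the sketch and is not taken from the paper; procedure calls are left out, and the tuple encoding of syntax trees is illustrative.

```python
H, L = "H", "L"

def leq(a, b):
    return a == L or b == H

def lub(a, b):
    return H if H in (a, b) else L

def glb(a, b):
    return L if L in (a, b) else H

def expr_level(gamma, e):
    """Least tau such that e : tau (literals are low, variables take gamma's level)."""
    tag = e[0]
    if tag == "num":
        return L
    if tag == "var":
        return gamma[e[1]]
    if tag == "op":                       # ("op", symbol, e1, e2)
        return lub(expr_level(gamma, e[2]), expr_level(gamma, e[3]))
    raise ValueError(f"unknown expression {e!r}")

def cmd_level(gamma, c):
    """Greatest tau such that c : tau cmd, or None if c is not typable."""
    tag = c[0]
    if tag == "assign":                   # ("assign", x, e)
        x = c[1]
        return gamma[x] if leq(expr_level(gamma, c[2]), gamma[x]) else None
    if tag == "seq":                      # ("seq", c1, c2)
        a, b = cmd_level(gamma, c[1]), cmd_level(gamma, c[2])
        return None if a is None or b is None else glb(a, b)
    if tag == "while":                    # ("while", e, body)
        body = cmd_level(gamma, c[2])
        if body is None or not leq(expr_level(gamma, c[1]), body):
            return None
        return body
    if tag == "if":                       # ("if", e, then, else)
        a, b = cmd_level(gamma, c[2]), cmd_level(gamma, c[3])
        if a is None or b is None:
            return None
        t = glb(a, b)
        return t if leq(expr_level(gamma, c[1]), t) else None
    raise ValueError(f"unsupported command {c!r}")

gamma = {"x": L, "y": H}
implicit = ("if", ("var", "y"), ("assign", "x", ("num", 0)), ("assign", "x", ("num", 1)))
print(cmd_level(gamma, implicit))
```

The program `implicit` is the one discussed in Section 2.3: the guard is high while both branches assign to a low variable, so no level satisfies rule (Cond).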

3.2 Compilation 

The compilation function $C_p : Prog \to Prog_c$ is defined in the usual way from
a compilation function on expressions $C_e : Expr \to Instr^{*}$, and a compilation
function on commands $C_c : Comm \to Instr^{*}$. Their formal definitions are given in
Figure 5. In order to enhance readability, we use :: both for consing an element
to a list and concatenating two lists, and we omit details of calculating pc in the
clauses for while and if expressions. We also use # to denote the length of a
list.
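
To make the clauses of Figure 5 concrete, the following Python sketch compiles tuple-encoded syntax trees to instruction lists. Since the figure omits how pc is calculated, the sketch threads an explicit start address through `compile_cmd` and emits absolute jump targets; that layout, the tuple encoding, and the handling of a parameterless main are assumptions of this illustration rather than the paper's definition.

```python
def compile_expr(e):
    """C_e: expressions contain no jumps, so no address is needed."""
    tag = e[0]
    if tag == "var":
        return [("load", e[1])]
    if tag == "num":
        return [("prim", e[1])]
    if tag == "op":                       # ("op", symbol, e1, e2)
        return compile_expr(e[2]) + compile_expr(e[3]) + [("prim", e[1])]
    raise ValueError(f"unknown expression {e!r}")

def compile_cmd(c, start):
    """C_c: code for c, assuming its first instruction sits at address `start`."""
    tag = c[0]
    if tag == "assign":                   # ("assign", x, e)
        return compile_expr(c[2]) + [("store", c[1])]
    if tag == "call":                     # ("call", f, e)
        return compile_expr(c[2]) + [("call", c[1])]
    if tag == "seq":
        first = compile_cmd(c[1], start)
        return first + compile_cmd(c[2], start + len(first))
    if tag == "while":                    # goto test :: body :: test :: if body
        body = compile_cmd(c[2], start + 1)
        test = compile_expr(c[1])
        test_start = start + 1 + len(body)
        return [("goto", test_start)] + body + test + [("if", start + 1)]
    if tag == "if":                       # test :: if then :: else :: goto end :: then
        test = compile_expr(c[1])
        else_start = start + len(test) + 1
        else_code = compile_cmd(c[3], else_start)
        then_start = else_start + len(else_code) + 1
        then_code = compile_cmd(c[2], then_start)
        end = then_start + len(then_code)
        return test + [("if", then_start)] + else_code + [("goto", end)] + then_code
    raise ValueError(f"unknown command {c!r}")

def compile_proc(param, body):
    """store x :: C_c(c) :: return; main is assumed to have no parameter (None)."""
    prologue = [("store", param)] if param is not None else []
    return prologue + compile_cmd(body, 1 + len(prologue)) + [("return",)]

def compile_program(procs):
    """C_p: procs maps a procedure name to (parameter or None, body)."""
    return {f: compile_proc(param, body) for f, (param, body) in procs.items()}

source = {"main": (None, ("if", ("var", "y"),
                          ("assign", "x", ("num", 1)),
                          ("assign", "x", ("num", 0))))}
for n, ins in enumerate(compile_program(source)["main"], start=1):
    print(n, *ins)
```

With this layout, compiling if y then x := 1 else x := 0 as the body of main yields the same eight instructions as the left-hand program of Example 2.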



3.3 Preservation of Security Types 

Compilation preserves typing. 

Theorem 2. If $\Gamma \vdash_s P$ then $\Gamma \vdash C_p(P)$.

The proof proceeds in two steps. First, we show how to compute from an
expression in the source language and its type, the type of the corresponding compiled
code produced by the function $C_e$. By abuse of notation, we write $se = \tau$ if
$se((f, j)) = \tau$ for every $(f, j) \in \mathcal{PP}$.

Lemma 3. Assume $e$ is an expression in $P$ and $\Gamma \vdash_s e : \tau$, and $C_p(P)_f[i \ldots j] =
C_e(e)$. For every $cs_0 \in CS$ and $st, se \in ST$ s.t. $se = \tau$, there exists
$S^{e}_{cs_0,st,se} : \{ (f, k) :: cs_0 \mid i \le k \le j + 1 \} \to ST$ (by abuse of notation, we often write
$S^{e}$) s.t.:

¹ For readability, we write $\vdash$ for $\vdash_s$.
    C_e(x)                      = load x
    C_e(n)                      = prim n
    C_e(e op e')                = C_e(e) :: C_e(e') :: prim op

    C_c(x := e)                 = C_e(e) :: store x
    C_c(f(e))                   = C_e(e) :: call f
    C_c(c1; c2)                 = C_c(c1) :: C_c(c2)
    C_c(while e do c)           = let l1 = C_e(e); l2 = C_c(c); x = #l2; y = #l1 in
                                  goto (pc + x + 1) :: l2 :: l1 :: if (pc - x - y)
    C_c(if e then c1 else c2)   = let le = C_e(e); lc1 = C_c(c1); lc2 = C_c(c2);
                                      x = #lc2; y = #lc1 in
                                  le :: if (pc + x + 2) :: lc2 :: goto (pc + y + 1) :: lc1

    C_p([proc f(x) = c; return]*) = [f := (store x :: C_c(c) :: return)]*

Fig. 5. Compilation of expressions and commands



1. for every $cs, cs' \in dom(S^{e})$, if $cs \mapsto cs'$ then $\Gamma, cs \vdash S^{e}(cs) \Rightarrow S^{e}(cs')$;

2. $S^{e}((f, j + 1) :: cs_0) = \tau :: st, se$.

Proof. By structural induction on instructions. 

Second, we extend the result to commands. 

Lemma 4. Assume $c$ is a command in $P$, and $\Gamma \vdash_s c : \tau$ cmd, and
$C_p(P)_f[i \ldots j] = C_c(c)$. For every $cs_0 \in CS$ and $st_0, se_0 \in ST$ s.t. $se_0 = \tau$,
there exists $S^{c}_{cs_0,st_0,se_0}$, taking values in $\mathcal{P}(ST)$ (by abuse of notation, we often write
$S^{c}$), s.t.:

1. for every $cs, cs' \in dom(S^{c})$ and $st, se \in S^{c}(cs)$ s.t. $cs \mapsto cs'$, there exists
$st', se' \in S^{c}(cs')$ s.t. $\Gamma, cs \vdash st, se \Rightarrow st', se'$;

2. there exists $st'$ s.t. $st' = st_0$ or $st' = lift_{\tau}\ st_0$, and for every $cs \in dom(S^{c})$,
$cs' \notin dom(S^{c})$ and $st, se \in S^{c}(cs)$ s.t. $cs \mapsto cs'$, $\Gamma, cs \vdash st, se \Rightarrow st', se_0$. We
write Co.sto.seo

Proof. By structural induction on instructions. 

Proof (of Theorem 2). Set $se_0 = \tau$. By construction, the function $S^{c}$
is defined for all $cs$ s.t. $(main, 1) :: \epsilon \mapsto^{*} cs$. It is then immediate to conclude.



3.4 Recovering Non-interference for the Source Language 

One can also prove that compilation preserves operational semantics. 

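Proposition 1 (Preservation of semantics). Let $P$ be a program whose main
procedure is [main := $c_{main}$; return] and $\rho : \mathcal{X} \to \mathcal{V}$. If $\langle c_{main}, \rho \rangle \Downarrow \mu$ then the
compiled program satisfies $\langle (main, 1), \rho, \epsilon \rangle \leadsto^{*} \langle \epsilon, \mu, \epsilon \rangle$.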
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By combining Proposition 1 and Theorem 2 we are able to recover the non- 
interference result for typable source programs. 

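Corollary 1 (Non-interference for source language). Let $P$ be a program,
let $\Gamma : X \to S$ and assume that $\Gamma \vdash_s P$. Then $P$ is non-interferent w.r.t. $\Gamma$ in
the sense that for every $\rho, \rho', \mu, \mu' : X \to V$,

$(\rho \sim_\Gamma \rho' \wedge (c_{main}, \rho) \Downarrow \mu \wedge (c_{main}, \rho') \Downarrow \mu') \Rightarrow \mu \sim_\Gamma \mu'$

Proof. By Proposition 1, $((\mathit{main}, 1), \rho, \epsilon) \mapsto^* (\epsilon, \mu, \epsilon)$ and $((\mathit{main}, 1), \rho', \epsilon) \mapsto^* (\epsilon, \mu', \epsilon)$.
Furthermore $\Gamma \vdash \mathcal{C}_p(P)$ by Theorem 2. Hence $\mathcal{C}_p(P)$ is non-interferent
w.r.t. $\Gamma$ by Theorem 1, and thus $\mu \sim_\Gamma \mu'$ by definition of non-interference.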



4 Conclusion 

We have shown how type systems can be used to enforce non-interference in 
a low-level language with procedures, and that one can define a security types 
preserving compiler from a high-level imperative language to such a low-level 
language. 



4.1 Related Work 

As emphasized in the introduction, static enforcement of non-interference 
through type systems is a well-researched topic, see e.g. [16] for a survey, and 
we can only comment on some of the most relevant literature. 



Procedures and exceptions. Non-interference for procedures and exceptions
was first studied (for high-level languages) by Volpano and Smith [19,18].
Improved type systems for exceptions have been studied by Myers for Java [12] 
and by Pottier and Simonet for ML [15,17]. 



Low-level languages. Lanet et al., see e.g. [5], develop a method to detect 
illicit flows for a sequential fragment of the JVM. In a nutshell, they proceed 
by specifying in the SMV model checker a symbolic transition semantics of the 
JVM that manipulates security levels, and by verifying that an invariant that 
captures the absence of illicit flows is maintained throughout the (abstract) 
program execution. Their analysis is more flexible than ours, in that it accepts 
programs such as $y_L := x_H;\ y_L := 0$. However, they do not provide a proof of
non-interference. The approach of Lanet et al. has been refined by Bernardeschi 
and De Francesco, see e.g. [4], for a subset of the JVM that includes jumps, 
subroutines but no exceptions. 







Type preserving compilation. Type preserving compilation has been thoroughly
studied in the context of typed intermediate languages, most notably for
ML and Java, see e.g. [10]. Information flow types preserving compilation has
been studied by Zdancewic and Myers in the context of λ-calculus and CPS
translation [21]. Also, Honda and Yoshida [9] consider type-preserving
interpretations of higher-order imperative calculi with security types to π-calculus with
security types. A similar result for resource control is being pursued in the MRG
project¹.

Certifying compilation, as advocated by Proof Carrying Code [ 13 ], extends 
the idea of type preserving compilation by producing, from a certificate (i.e. a 
proof object) that a source program adheres to a property, a certificate that the 
compiled program adheres to a corresponding property, see e.g. [ 14 ]. 



4.2 Future Work 

Our work constitutes a preliminary investigation in the realm of certifying com- 
pilation for security properties, and may be extended in several directions. 

— Language expressiveness: we would like to extend the results of this paper to
more powerful languages that include objects and/or higher-order functions. 
We are particularly interested in scaling up our results to the sequential 
fragment of Java and of the JVM, building up on [2] for the former and on 
recent, unpublished, work by the authors for the latter. 

— Generality: the main result of this paper is specialized to one particular
compilation function that departs from standard compilers, e.g. by not being 
optimizing. We would like to isolate a set of constraints that guarantees 
preservation of typability for security types, and investigate the impact of 
standard compiler optimizations on security types. 

— Integrity: it is of practical interest, and we believe straightforward, to adapt 
our results to integrity. Indeed, weak forms of integrity guarantee that high 
variables may not be modified by a low writer, and are dual to confidentiality. 



References 

1. T. Ball. What’s in a region? Or computing control dependence regions in near- 
linear time for reducible control flow. ACM Letters on Programming Languages 
and Systems, 2(1-4):1-16, March-December 1993. 

2. A. Banerjee and D. A. Naumann. Secure Information Flow and Pointer Confine- 
ment in a Java-like Language. In Proceedings of CSFW’02. IEEE Computer Society 
Press, 2002. 

3. G. Barthe and B. Serpette. Partial evaluation and non-interference for object 
calculi. In A. Middeldorp and T. Sato, editors, Proceedings of FLOPS’99, volume
1722 of Lecture Notes in Computer Science, pages 53-67. Springer-Verlag, 1999.

¹ See http://www.dcs.ed.ac.uk/home/mrg







4. C. Bernardeschi and N. De Francesco. Combining Abstract Interpretation and
Model Checking for analysing Security Properties of Java Bytecode. In A. Cortesi,
editor, Proceedings of VMCAI’02, volume 2294 of Lecture Notes in Computer
Science, pages 1-15, 2002.

5. P. Bieber, J. Cazin, V. Wiels, G. Zanon, P. Girard, and J.-L. Lanet. Checking
Secure Interactions of Smart Card Applets: Extended version. Journal of Computer
Security, 10:369-398, 2002.

6. A. Coglio. Simple Verification Technique for Complex Java Bytecode Subroutines.
In Proceedings of FTFJP’02, 2002.

7. J. Goguen and J. Meseguer. Security policies and security models. In Proceedings
of SOSP’82, pages 11-22. IEEE Computer Society, 1982.

8. N. Heintze and J. Riecke. The SLam calculus: programming with secrecy and
integrity. In Proceedings of POPL’98, pages 365-377. ACM Press, 1998.

9. K. Honda and N. Yoshida. A Uniform Type Structure for Secure Information Flow.
In Proceedings of POPL’02, pages 81-92. ACM Press, 2002.

10. C. League, Z. Shao, and V. Trifonov. Precision in Practice: A Type-Preserving
Java Compiler. In G. Hedin, editor, Proceedings of CC’03, volume 2622 of Lecture
Notes in Computer Science, pages 106-120. Springer-Verlag, 2003.

11. X. Leroy. Java bytecode verification: an overview. In G. Berry, H. Comon, and
A. Finkel, editors, Proceedings of CAV’01, volume 2102 of Lecture Notes in
Computer Science, pages 265-285. Springer-Verlag, 2001.

12. A. C. Myers. JFlow: Practical mostly-static information flow control. In Proceedings
of POPL’99, pages 228-241. ACM Press, 1999.

13. G. C. Necula. Proof-Carrying Code. In Proceedings of POPL’97, pages 106-119.
ACM Press, 1997.

14. G. C. Necula and P. Lee. The Design and Implementation of a Certifying Compiler.
In Proceedings of PLDI’98, pages 333-344, 1998.

15. F. Pottier and V. Simonet. Information flow inference for ML. ACM Transactions
on Programming Languages and Systems, 25(1):117-158, January 2003.

16. A. Sabelfeld and A. Myers. Language-Based Information-Flow Security. IEEE
Journal on Selected Areas in Communications, 21:5-19, January 2003.

17. V. Simonet. Fine-grained Information Flow Analysis for a Lambda-Calculus with
Sum Types. In Proceedings of CSFW’02, pages 223-237, 2002.

18. D. Volpano and G. Smith. A Type-Based Approach to Program Security. In
M. Bidoit and M. Dauchet, editors, Proceedings of TAPSOFT’97, volume 1214 of
Lecture Notes in Computer Science, pages 607-621. Springer-Verlag, 1997.

19. D. Volpano and G. Smith. Eliminating covert flows with minimum typings. In
Proceedings of CSFW’97, pages 156-168. IEEE Press, 1997.

20. D. Volpano, G. Smith, and C. Irvine. A Sound Type System for Secure Flow
Analysis. Journal of Computer Security, pages 167-187, December 1996.

21. S. Zdancewic and A. Myers. Secure information flow and CPS. In D. Sands, editor,
Proceedings of ESOP’01, volume 2028 of Lecture Notes in Computer Science, pages
46-61. Springer-Verlag, 2001.




History-Dependent Scheduling for 
Cryptographic Processes 



Vincent Vanackere 

Laboratoire d’Informatique Fondamentale de Marseille
Université de Provence,

39 rue Joliot-Curie, 13453 Marseille, France
vanackere@cmi.univ-mrs.fr



Abstract. This paper presents history-dependent scheduling, a new
technique for reducing the search space in the verification of
cryptographic protocols. This technique allows the detection of some
“non-minimal” interleavings during a depth-first search, and was implemented
in TRUST, our cryptographic protocol verifier. We give some experimental
results showing that our method can greatly increase the efficiency
of the verification procedure.



1 Introduction 

In recent years, several symbolic reduction systems have been introduced, that 
aim at the analysis of security protocols. Some examples of this approach include 
[7,1,4]. These symbolic methods have an advantage over more traditional model- 
checking approaches in that they allow for the exploration and verification of 
an otherwise infinite-branching system, making it possible to perform an exact 
analysis for systems with a finite number of cryptographic processes. 

However, the same problem remains as in most other model-checking tech- 
niques: as the number of parallel processes goes up, the number of possible 
interleavings makes the verification task harder - if not impossible in practice - 
because of the state-space explosion problem. 

In this paper we present history-dependent scheduling, a new reduction
technique that has been developed to be used in TRUST [10,11], our cryptographic
protocol verifier. This technique allows, in a depth-first search setting, the detec- 
tion of some redundant - “non-minimal” - interleavings, and is also well-suited 
to symbolic transition systems, where few transitions actually commute and the 
usual reduction methods such as those of [6] do not apply. 

The paper is organised as follows. After the presentation of our formal model, 
we will give an intuitive overview of our reduction procedure. This procedure will 
then be formally described and proved in the next two sections. The paper will 
end with some experimental results and concluding remarks. 
2 Model

Our full model is presented in detail in [2], and we will therefore only give a
short version here.

We work under the usual Dolev-Yao model [5], where the network is under full
control of an adversary that can analyse all messages exchanged and synthesize
new ones. In our setting, messages can be viewed as terms in a free algebra -
we work under the perfect encryption assumption - and we distinguish between
basic names (agents’ names, nonces, keys, ...) and composed messages (pairs
$\langle \_, \_ \rangle$ and encrypted terms $E(\_, \_)$), with the restriction that only basic names
may be used as encryption keys. The set of names is denoted by $\mathcal{N}$ and the full
set of messages by $\mathcal{M}$.



2.1 The Intruder’s Knowledge 

Given a set of terms $T$, we will denote the set of terms that the intruder may
derive by $\mu(T)$. We assume a (computable) relation $\mathcal{D} \subseteq \mathcal{N} \times \mathcal{N}$ with the following
interpretation:

$(C, C') \in \mathcal{D}$ iff messages encrypted with $C$ can be decrypted with $C'$.

We define $Inv(C) = \{C' \mid (C, C') \in \mathcal{D}\}$. Further hypotheses on the properties
of $\mathcal{D}$ allow us to model hashing, symmetric, and public keys. In particular: (i) for
a hashing key $C$, $Inv(C) = \emptyset$, (ii) for a symmetric key $C$, $Inv(C) = \{C\}$, and
(iii) for a public key $C$ there is another key $C^{-1}$ such that $Inv(C) = \{C^{-1}\}$ and
$Inv(C^{-1}) = \{C\}$.

Given a set of terms T we define the S (synthesis) and A (analysis) operators 
as follows: 

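— $S(T)$ is the least set that contains $T$ and such that:

$t_1, t_2 \in S(T) \Rightarrow \langle t_1, t_2 \rangle \in S(T)$

$t_1 \in S(T),\ t_2 \in T \cap \mathcal{N} \Rightarrow E(t_1, t_2) \in S(T)$.

— $A(T)$ is the least set that contains $T$ and such that:

$\langle t_1, t_2 \rangle \in A(T) \Rightarrow t_i \in A(T),\ i = 1, 2$

$E(t_1, t_2) \in A(T),\ A(T) \cap Inv(t_2) \neq \emptyset \Rightarrow t_1 \in A(T)$.

As an example, if $T = \{E(\langle A, B \rangle, K), K^{-1}\}$ then $A(T) = T \cup \{A, B, \langle A, B \rangle\}$
and we have $E(A, K^{-1}) \in S(A(T))$.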

With the above definitions, the knowledge that the intruder may derive from
$T$ is $\mu(T) = S(A(T))$. It should be noticed that the knowledge $\mu(T)$ is infinite
whenever $T$ is non-empty, and that it increases monotonically with $T$.
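Although the knowledge S(A(T)) is infinite, membership in it is easy to decide: A(T) is a finite set obtained by a fixpoint, and a term belongs to S(A(T)) if it can be built from elements of A(T). The following Python sketch illustrates this under our own assumptions: names are strings, pairs are encoded as `("pair", t1, t2)`, encryptions as `("enc", t, key)`, and `inv` is a hypothetical function returning the set Inv(C) of a key.

```python
def analysis(T, inv):
    """Least set containing T and closed under unpairing and decryption (A(T))."""
    known = set(T)
    changed = True
    while changed:
        changed = False
        for t in list(known):
            new = set()
            if isinstance(t, tuple) and t[0] == "pair":
                new = {t[1], t[2]}
            elif isinstance(t, tuple) and t[0] == "enc" and known & inv(t[2]):
                # Some inverse of the encryption key has already been analysed.
                new = {t[1]}
            if not new <= known:
                known |= new
                changed = True
    return known

def derivable(t, T, inv):
    """Decide whether t is in S(A(T)) without enumerating the infinite set."""
    a = analysis(T, inv)

    def synth(u):
        if u in a:
            return True
        if isinstance(u, tuple) and u[0] == "pair":
            return synth(u[1]) and synth(u[2])
        if isinstance(u, tuple) and u[0] == "enc":
            # Only basic names available to the intruder may serve as keys.
            return isinstance(u[2], str) and u[2] in a and synth(u[1])
        return False

    return synth(t)

def example_inv(key):
    """Hypothetical key table: K and K-1 form a public/private key pair."""
    return {"K": {"K-1"}, "K-1": {"K"}}.get(key, set())
```

With T = {E(⟨A, B⟩, K), K⁻¹} encoded as `{("enc", ("pair", "A", "B"), "K"), "K-1"}`, the analysis adds the pair and its two components, and `derivable(("enc", "A", "K-1"), T, example_inv)` holds, matching the example above.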
2.2 Processes and Configurations: Semantics

In our framework, a protocol is modelled as a finite number of processes interact- 
ing with an environment. Our process syntax includes the parallel composition 
- commutative and associative - of two processes, and thus we define a
configuration $k$ as a pair $(P, T)$ where $P$ is a process and $T$ is an environment;
the environment is a set of terms, composed of the initial knowledge augmented
with all the messages emitted by the participants of the protocol so far. In the
following, the adversary knowledge in a configuration $k = (P, T)$ will be denoted
by $\mu(k) = \mu(T)$.

Figure 1 gives the semantic rules as a reduction system on configurations. 
Informally, a process can either: 

— (!) Write a message: the term is added to the environment knowledge. 

— (?) Read some message from the environment: this can be any message the 
adversary is able to build from its current knowledge. 

— (d) Decrypt some (encrypted) term with a corresponding inverse key. 

— (pl) Perform some unpairing (the symmetric rule (pr) is not written).

— (m1, m2) Test for the equality/inequality of two messages.

— (a) Check if some assertion $\varphi$ holds in the current configuration.



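(!)   $(!t.P \mid P', T) \to (P \mid P', T \cup \{t\})$   if $t \in \mathcal{M}$

(?)   $(?x.P \mid P', T) \to ([t/x]P \mid P', T)$   if $t \in \mu(T)$

(d)   $(x \leftarrow dec(E(t, C), C').P \mid P', T) \to ([t/x]P \mid P', T)$   if $C' \in Inv(C),\ t \in \mathcal{M}$

(pl)  $(x \leftarrow proj_l(\langle t, t' \rangle).P \mid P', T) \to ([t/x]P \mid P', T)$   if $t, t' \in \mathcal{M}$

(a)   $(assert(\varphi).P \mid P', T) \to (P \mid P', T)$ if $\models_T \varphi$, and $\to$ err if $\not\models_T \varphi$

(m1)  $([t = t]P_1, P_2 \mid P', T) \to (P_1 \mid P', T)$   if $t \in \mathcal{M}$

(m2)  $([t = t']P_1, P_2 \mid P', T) \to (P_2 \mid P', T)$   if $t \neq t',\ t, t' \in \mathcal{M}$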

Fig. 1. Reduction on configurations 



Missing from the figure is the terminated process, denoted by 0, as well as the
syntax of the assertion language, which will be presented in the next section. err
denotes a special configuration that can only be reached from a false assertion.

In our model, a correct protocol is a protocol that cannot reach the err con- 
figuration - or, put in other words, a protocol such that all assertions reachable 
from the initial configuration of the system hold. 

This reachability problem was shown to be NP-complete [2,9]. 

2.3 The Assertion Language 

The full assertion language we consider is the following: 

$\varphi ::= true \mid false \mid t = t' \mid t \neq t' \mid known(t) \mid secret(t) \mid \varphi_1 \wedge \varphi_2 \mid \varphi_1 \vee \varphi_2$







This is equivalent to saying that we consider arbitrary boolean combinations
of atomic formulas checking the equality of two messages ($t = t'$) and the secrecy
of a message ($secret(t)$) with respect to the current knowledge of the adversary.
As shown in [2], this language makes it easy to express authentication properties
such as aliveness and agreement [8]. An actual example of a specification using
this assertion language is given in Appendix A.

The valuation of an assertion formula in an environment $T$ is the “intuitive”
one: for example, we have $\models_T secret(t)$ iff $t \notin \mu(T)$. The only property on
assertions that is used in this paper is the fact that the truth value of an assertion
in an environment $T$ depends only on the knowledge $\mu(T)$.
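The fact that an assertion depends only on the adversary's knowledge can be made concrete by evaluating formulas against a knowledge oracle. The sketch below is illustrative only: the tuple encoding of formulas and the `knows` callback (standing for membership in μ(T)) are our assumptions, and reading `known(t)` as the negation of `secret(t)` is our interpretation, since the text only spells out the `secret` case.

```python
def holds(phi, knows):
    """Evaluate an assertion given knows(t), an oracle for 't is in mu(T)'."""
    tag = phi[0]
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "eq":
        return phi[1] == phi[2]
    if tag == "neq":
        return phi[1] != phi[2]
    if tag == "known":            # assumed dual of secret
        return knows(phi[1])
    if tag == "secret":
        return not knows(phi[1])
    if tag == "and":
        return holds(phi[1], knows) and holds(phi[2], knows)
    if tag == "or":
        return holds(phi[1], knows) or holds(phi[2], knows)
    raise ValueError(f"unknown assertion: {phi!r}")
```

Because the environment appears only through `knows`, two environments with the same knowledge necessarily give every assertion the same truth value.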

2.4 Symbolic Reduction and Challenges for Practical Verification 

The main difficulty in the verification task is the fact that the input rule is
infinitely branching as soon as the environment is not empty. In [1,2] it was
shown that it is possible to solve this problem by using a symbolic reduction
system that stores the constraints in a symbolic shape during the execution.
As an example, the input rule $(?x.P, T) \to ([t/x]P, T),\ t \in \mu(T)$ becomes
$(?x.P, T, E) \to (P, T, (E; x : T))$. The complete description of the symbolic
reduction system can be found in [2]. The main property we rely on is the fact that
the symbolic reduction system is in lockstep with the ground one and provides a
- sound and complete - decision procedure for processes specified using the full
assertion language described in Sect. 2.3.

Although the symbolic reduction system is satisfactory from a theoretical point
of view, an inherent limitation is that it does not handle iterated processes (as
the general case for iterated processes is undecidable). Thus, in order to verify a
protocol against replay attacks and/or parallel-session attacks, it is important
to handle cases where there is a finite - even if small - number of participants
playing each role.

Of course, as the number of parallel threads goes up, the number of possible
interleavings makes the verification task harder - if not impossible in practice
- because we face the classical state explosion problem. Usual model-checking
techniques to handle this problem involve the use of partial-order reduction [6],
but unfortunately the symbolic reduction system does not lend itself very well to
this approach, as different sequences of symbolic transitions will usually not lead
to the same symbolic state.

We will tackle this problem by using a different approach: the basic idea 
is that by monitoring the inputs/outputs of the processes, we will be able to 
explore only some “complete” set of “interesting” interleavings. Due to the rel- 
ative complexity of our reduction procedure, we will start by giving an intuitive 
overview in the next section. 

3 A Scheduler for Cryptographic Processes 

This section provides the intuition behind our reduction procedure. First, we 
recall the most important facts about our system: 
— We consider a parallel composition of n processes.

— Processes can only interact through the environment. 

— Each reduction step occurs only on one process, and its effect only depends 
on the environment knowledge. 

— The environment knowledge can only increase along the reduction. 

We will tackle our verification problem by using the following point of view: 

— To each process of the parallel composition is associated a unique “nice level”
(in the following, we will, without loss of generality, assume that the nice 
level of a process is its number in the parallel composition). 

— The verification procedure is performed by a scheduler which is in charge,
at each step of the reduction, of choosing the next process to be reduced, 
based on information on the past of the reduction. The scheduler may also 
choose to stop the exploration of the current branch. 

This scheduler is similar to the one in a usual operating system, in the sense 
that it makes use of information on the past to make its decisions; the analogy
stops here because our verification scheduler does not aim at achieving fairness 
or latency, but only at completeness. Moreover, it is also non-deterministic and 
has the possibility to sequentially explore several branches starting from the 
same configuration. Obviously, if we take as a scheduler the one exploring all 
possible branches at each step of the reduction, our description is just another 
name for a depth-first search (but, of course, we will try to be more clever...). 

Our reduction procedure can now be (informally) illustrated as a scheduler 
that does the following: 

Eager reduction: If the scheduler has selected some process for reduction, then 
this process may as well remain selected until it modifies the environment 
knowledge, else the reduction done on this process would not be able to affect 
the possible reductions of the other processes. From now on, we will call 
“eager step of reduction” any succession of reductions on the same process 
where the last step strictly increases the environment knowledge, and we will 
assume that the scheduler only performs eager steps of reduction. 

Minimal traces: Whenever the scheduler stops reducing some process and 
then, later, decides to resume the reduction of this process, it will expect 
that the new eager step of reduction: 

1. could not have happened at the time the reduction on this process was 
stopped (“no late work” clause); 

2. could not have happened before the reduction of another process with a 
higher nice level (“priority” clause). 

If any of those 2 conditions does not hold, the scheduler will simply stop the 
exploration of the current reduction. 

We will now focus on formally defining and proving this reduction procedure.
Section 4 will focus on the eager reduction procedure (this notion was already
briefly described in [10]). Then, in Sect. 5 we will introduce a notion of minimal
traces and give some criteria allowing the detection of non-minimal traces.
4 Eager Reduction

This section focuses on the description of the eager reduction procedure, that 
will be proven correct and complete w.r.t. the original reduction system. 

In the following, we study the reduction of a configuration $(P_1 \mid \ldots \mid P_m, T)$,
denoted by $(\Pi_i P_i, T)$. We will not allow the rewrite of $P \mid Q$ as $Q \mid P$, and
therefore we can define the relation $\to_x$ as a reduction on the x-th process of the
parallel composition. The variables $k, k', \ldots$ will be used for configurations: we
recall that if $k = (\Pi_i P_i, T)$, then $\mu(k)$ is defined as the environment knowledge
in that configuration ($\mu(k) = \mu(T) = S(A(T))$), and if $k = err$ we set $\mu(err) = \emptyset$.

Reduction steps that do not modify the environment knowledge will play an 
important role in the following, and will be denoted as “silent” : 

Definition 4.1 (Silent reduction).

We will write $k \xrightarrow{\tau}_x k'$ iff $k \to_x k'$ and $\mu(k') = \mu(k)$.

The following commutation lemma for silent reductions, whose proof can be 
found in Appendix B, plays a crucial role in our reduction procedure: 

Lemma 4.1 (Commutation lemma). 

jf ■ , ■ {!) k ■ -^j err k -^j err 

( 2 ) k k' ^ err ^ k k' 

A corollary of this lemma is the fact that if we consider a sequence of silent
reductions $k \xrightarrow{\tau}{}^{*} k'$, then we can choose - in the path from $k$ to $k'$ - any order
to reduce the processes of the parallel composition.

We will now annotate sequences of reductions to include the order in which
the different processes modify the environment knowledge. In the following, we
will write $\twoheadrightarrow$ as a shortcut for $\to^*$. In a similar way, we will write $\twoheadrightarrow_x = (\to_x)^*$
for a sequence of reductions occurring on the process x.

Definition 4.2. We write:

(1) $k \twoheadrightarrow^{\tau} k'$ iff $k \twoheadrightarrow k'$ and $\mu(k') = \mu(k)$

(2) $k \twoheadrightarrow^{x} k'$ iff $k \twoheadrightarrow^{\tau} \cdot \to_x \cdot \twoheadrightarrow^{\tau} k'$ and $\mu(k') \neq \mu(k)$

(3) $k \twoheadrightarrow^{x_1, \ldots, x_n} k'$ iff $k \twoheadrightarrow^{x_1} \cdot \twoheadrightarrow^{x_2} \cdots \twoheadrightarrow^{x_n} k'$

The relation $\twoheadrightarrow^{\tau}$ stands for any sequence of silent reductions; $\twoheadrightarrow^{x}$ means that
the environment knowledge is modified (increased) exactly once, by the process
x, during the reduction. Notation $\twoheadrightarrow^{x_1, \ldots, x_n}$ is only syntactic sugar: intuitively,
the sequence $\{x_1, \ldots, x_n\}$ represents the order in which the processes bring new
information to the environment.

Remark 4.1 (An output can hide behind another...).

If $k \twoheadrightarrow k'$ then there exists a sequence $\{x_1, \ldots, x_n\}$ such that $k \twoheadrightarrow^{x_1, \ldots, x_n} k'$.
It should be noted that the set of sequences $s$ satisfying $k \twoheadrightarrow^{s} k'$ is not related
to Mazurkiewicz traces¹; as an example, if $k = (!A.P_1 \mid !\langle A, B \rangle.P_2, \{B\})$ and
$k' = (P_1 \mid P_2, \{A, B, \langle A, B \rangle\})$, both the sequences $\{1\}$ and $\{2\}$ satisfy $k \twoheadrightarrow^{s} k'$
(after the output from either one of the processes, an output from the other one
will not bring any new information to the environment).

Definition 4.3 (Eager reduction).

We define $\hookrightarrow_x$, a step of eager reduction on the process x, as:

$k \hookrightarrow_x k'$ iff $k \twoheadrightarrow^{\tau}_x \cdot \to_x k'$ and $\mu(k') \neq \mu(k)$.

The eager reduction relation is then defined by $\hookrightarrow\ = \bigcup_x \hookrightarrow_x$.

By extension, we will write $k \hookrightarrow_{x_1, \ldots, x_n} k'$ whenever $k \hookrightarrow_{x_1} \cdots \hookrightarrow_{x_n} k'$.

Thus, during a step of eager reduction, we reduce only some process x of the
configuration until it increases the environment knowledge or reaches error: this
formal definition is the one corresponding to rule (1) of our scheduler.

By using Lemma 4.1, we can establish the following theorem on eager reduc- 
tion (the proof can be found in Appendix C) : 

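Theorem 4.1.

1. $k \twoheadrightarrow^{x_1, \ldots, x_n} k' \neq err \Rightarrow k \hookrightarrow_{x_1, \ldots, x_n} \cdot \twoheadrightarrow^{\tau} k'$

2. $k \twoheadrightarrow^{x_1, \ldots, x_n} err \Rightarrow k \hookrightarrow_{x_1, \ldots, x_n} err$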

We can now state the soundness and completeness of the eager reduction: 

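Theorem 4.2.

$k \to^* err$ iff $k \hookrightarrow^* err$

Proof. Completeness follows from Theorem 4.1(2): if $k \to^* err$, then there exists
$\{x_1, \ldots, x_n\}$ such that $k \twoheadrightarrow^{x_1, \ldots, x_n} err$, and thus $k \hookrightarrow_{x_1, \ldots, x_n} err$, which means
$k \hookrightarrow^* err$. Soundness comes trivially from $\hookrightarrow\ \subseteq\ \to^*$. □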

5 Characteristics and Minimal Traces 

In this section we will show how it is possible to avoid the full exploration of all 
eager traces by exploiting some information on the past of a trace. It is namely 
possible, by monitoring the values taken in input, to detect, at the end of its 
execution, that some step of reduction could have occurred earlier in the trace; 
by using a suitable order on traces (Definition 5.2), we will show that we can 
cut this branch of the search space whenever we detect that the current trace 
cannot be a minimal one. 

Throughout this section, we consider an initial configuration $k$ such that $k \to^* err$
and we will show the existence of a reduction sequence from $k$ to err verifying
some particular properties.

¹ We recall that two words/traces are Mazurkiewicz-equivalent iff they can be obtained
by permutations of independent letters/transitions.
5.1 Definitions

Definition 5.1 (Traces and characteristics).

An eager trace from $k$ to $k'$ is a sequence $k \hookrightarrow k_1 \hookrightarrow \cdots \hookrightarrow k'$.

We will denote $k \hookrightarrow_{p^m} k'$ whenever $k \hookrightarrow_p \cdots \hookrightarrow_p k'$ with m eager steps.

Any eager trace can be uniquely written as:

$k \hookrightarrow_{p_1^{m_1}} \cdots \hookrightarrow_{p_n^{m_n}} k'$ with $\forall i\ p_i \neq p_{i+1}$ and $m_i > 0$.

The characteristic of this trace is then defined by the tuple $(p_1^{m_1}, \ldots, p_n^{m_n})$.
In the following, we will often write “trace” as a shortcut for “eager trace”.
We now introduce a way to compare the characteristics of traces:

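Definition 5.2 (Partial order on characteristics).

The relation $\prec$ is defined as:

$(p_1^{m_1}, \ldots, p_n^{m_n}) \prec (p_1'^{m_1'}, \ldots, p_{n'}'^{m_{n'}'})$ iff $\exists i \in \{1, \ldots, \min(n, n')\}$ such that

    $\forall j < i\ \ (p_j, m_j) = (p_j', m_j')$, and
    $p_i < p_i' \vee (p_i = p_i' \wedge m_i > m_i')$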



Example 5.1. li a < b < c, then {of, b^, c^) ^ (a^, c^) ^ (a^, c^, b’^). 

The introduction of this particular relation could appear counter-intuitive,
due to the condition $m_i > m_i'$. The informal explanation is the following: the
purpose of this order is to set as minimal the traces for which stopping the 
reduction on some process implies that the next reduction on this process must 
necessarily depend on what happened “in-between” . Then, if k k' 

and k ‘ k', the first sequence will have a smaller characteristic 

than the second one (and should ideally be favoured by our scheduler). 

It should be noted that our relation makes use of an arbitrary order on the 
processes - their number - that corresponds in fact to the “nice levels” of our 
informal scheduler description. 

Lemma 5.1. $\prec$ is a partial order.

Remark 5.1. In fact, two characteristics are always comparable unless one is the 
prefix of the other. 
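The order on characteristics can be illustrated with a short Python sketch. It is only our reading of Definition 5.2: a characteristic is encoded as a list of (process number, number of eager steps) pairs, and the function names are hypothetical.

```python
from itertools import groupby

def characteristic(process_sequence):
    """Run-length encode the sequence of processes performing the eager steps."""
    return [(p, len(list(group))) for p, group in groupby(process_sequence)]

def precedes(c1, c2):
    """Strict order on characteristics: decided at the first differing position."""
    for (p, m), (q, n) in zip(c1, c2):
        if (p, m) == (q, n):
            continue
        # Smaller process number first; on the same process, MORE steps is smaller.
        return p < q or (p == q and m > n)
    # No difference within the common length: one is a prefix of the other.
    return False
```

For instance, a trace whose eager steps are performed by processes 1, 1, 2, 1 has characteristic [(1, 2), (2, 1), (1, 1)]. It precedes [(1, 1), (2, 1), (1, 2)], since the first block is longer, whereas neither of [(1, 2)] and [(1, 2), (2, 1)] precedes the other, in line with Remark 5.1.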

The following lemma will allow us to establish the existence of minimal traces 
leading to error: 

Lemma 5.2. The set of all characteristics associated to some non-empty set of 
traces from the same initial configuration admits some minimal elements. 

Proof. Although $\prec$ is not well founded, this property simply holds because there
is only a finite number of possible characteristics for all the traces starting from
a given initial configuration $k$ (namely, if $W$ is the total number of output
instructions contained in $k$, any eager trace will be at most of length $W + 1$). □

Definition 5.3 (Minimal traces).

We consider the set of all characteristics associated to all traces from $k$ to
err: this set admits minimal elements, and any trace from $k$ to err whose
characteristic is minimal will be called minimal.



As a consequence, finding criteria for non-minimal traces will allow us to 
avoid the full exploration of all traces by the scheduler. 
5.2 Criteria for Non-minimal Traces

In this section, we will develop a characterisation of some non-minimal traces, 
based on the observation of the values taken by the input variables of the pro- 
cesses. This will require a new notation: 

Definition 5.4. Let $X$ be a set of terms. We denote:

$(?x.P, T) \to^{X} ([t/x]P, T)$ iff $t \in S(A(T)) \cap X$

By extension, we will also denote $k \hookrightarrow^{X} k'$ when all the input values in the
reduction from $k$ to $k'$ are included in the set $X$.

The following - crucial - lemma is proven in Appendix D: 

Lemma 5.3 (Eager commutation). 

I I 

If i j> X <Z fi{k) and k err, then k err. 

We can now state our fundamental theorem, that gives two sufficient condi- 
tions for the non-minimality of a trace: 

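Theorem 5.1 (Non-minimality). Any trace $k_1 \hookrightarrow_{p_1^{m_1}} \cdots \hookrightarrow err$
containing a sub-trace of one of the following forms is not minimal (we denote
$X_i = \mu(k_i)$):

(1) $k_i \hookrightarrow_{p_i^{m_i}} k_{i+1} \cdots \hookrightarrow_{p_j^{m_j}} k_{j+1} \hookrightarrow^{X_{i+1}}_{p} k' \neq err$
    with $p = p_i$, $p \notin \{p_{i+1}, \ldots, p_j\}$

(2) $k_i \hookrightarrow_{p_i^{m_i}} k_{i+1} \cdots \hookrightarrow_{p_j^{m_j}} k_{j+1} \hookrightarrow^{X_i}_{p} k' \neq err$
    with $p < p_i$, $p \notin \{p_i, \ldots, p_j\}$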

Proof. 



(1) By iterating Lemma 5.3, we show ki err. Therefore there 

exists some trace of characteristic (pi™L . . . ,Pi^'^'^, . . . ) leading to error, 
with q> 1. However, by definition of ^ we have (pi"*S ■ • ■ ■ . ■) P 

{pi^^ , . . . ,Pi^' , . ■ .), thus the non-minimality of our initial trace. 

(2) By iterating Lemma 5.3, we get ki ^p ■ err. If i > 1 and p = Pi-i, we 

are back to the previous case, else there exists some trace with characteristic 
(pi"*L • • • . . . ) leading to error, with q > 1. As p < pi, we 

have (pi™L ■ • ■ • ■ • ) ^ • ■ • jPi"^% ■ • ■ )) our initial trace is not 

minimal. 
5.3 Configurations with History

Theorem 5.1 shows that by having the relevant information on the past of the
current trace, our scheduler could easily detect - and discard - non-minimal
traces. We will therefore enrich configurations with a third component, the
history, which will be a list of triples; for each reduction sequence on one process, we
will indeed record the process number, the current environment just at the
beginning of the reduction on this process, and the list of all the values that were read
during the sequence. Formally, we write: $\mathcal{H} = (p_0, T_0, X_0) : \cdots : (p_n, T_n, X_n)$,
where each $p_i$ is a process number, $T_i$ an environment and $X_i$ a set of terms.

We define a function $append_H$ for updating a history as follows:

Definition 5.5. If $\mathcal{H} = (p_0, T_0, X_0) : \cdots : (p_n, T_n, X_n)$, then:

- if $p = p_n$, $append_H(\mathcal{H}, p, T, X) = (p_0, T_0, X_0) : \cdots : (p_n, T_n, X_n \cup X)$

- if $p \neq p_n$, $append_H(\mathcal{H}, p, T, X) = (p_0, T_0, X_0) : \cdots : (p_n, T_n, X_n) : (p, T, X)$.

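Reduction rules are then modified to integrate a history:

— the input rule becomes:

$(?x.P \mid P', T, \mathcal{H}) \to_i ([t/x]P \mid P', T, \mathcal{H}')$ if $t \in \mu(T)$ and $\mathcal{H}' = append_H(\mathcal{H}, i, T, \{t\})$

— for all the other rules, if $(P, T) \to_i (P', T')$ then:

$(P, T, \mathcal{H}) \to_i (P', T', \mathcal{H}')$ if $\mathcal{H}' = append_H(\mathcal{H}, i, T, \emptyset)$.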



Put differently, each time the scheduler chooses a new process for reduction,
it starts a new section of the history and records the current environment;
this section will then be updated with all the values that are read by this process 
during its execution/reduction. 
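The history update of Definition 5.5 is small enough to sketch directly. In the Python illustration below, the list-of-triples representation, the use of `frozenset`, and the treatment of the empty history (a first section is simply opened) are our assumptions; the definition itself only covers non-empty histories.

```python
def append_history(history, p, T, X):
    """Return the history extended by a step of process p reading values X in T."""
    if history and history[-1][0] == p:
        # Same process as the current section: only accumulate the values read.
        q, T_n, X_n = history[-1]
        return history[:-1] + [(q, T_n, X_n | frozenset(X))]
    # New process: open a section recording the environment at its beginning.
    return history + [(p, frozenset(T), frozenset(X))]
```

Note that the environment recorded for a section is the one in force when the section was opened; later steps of the same process never overwrite it.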

We can now state our final theorem, giving the two conditions that will be 
checked, after each eager step, by the scheduler in order to discard non-minimal 
traces: 

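Theorem 5.2 (Non-minimal history). We consider a trace $k \hookrightarrow^* k' \neq err$
such that the history $\mathcal{H}(k') = (p_0, T_0, X_0) : \cdots : (p_n, T_n, X_n)$ satisfies one of the
following properties:

(1) $\exists i$ such that $p_i = p_n$, $p_n \notin \{p_{i+1}, \ldots, p_{n-1}\}$ and $X_n \subseteq \mu(T_{i+1})$

(2) $\exists i$ such that $p_i > p_n$, $p_n \notin \{p_{i+1}, \ldots, p_{n-1}\}$ and $X_n \subseteq \mu(T_i)$

Then this trace cannot be the prefix of a minimal one.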

Proof. $\mathcal{H}(k')$ verifies either (1) or (2) and therefore the eager reduction to $k'$ will
verify the corresponding condition from Theorem 5.1. As a consequence, any 
trace having the one to k' as a prefix cannot be a minimal one. 
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6 Experimental Results 

We have implemented our history-based reduction procedure in our verifier,
TRUST. Figure 2 gives some times for the full analysis of some typical
protocols: the measurements were taken on an Athlon XP 1800, and all times are
in seconds (the time spent by the verifier is roughly proportional to the number
of basic reduction steps that are done). For each protocol, we detail the
number of roles involved and give the times to do a full search depending on the
number of parallel (interleaved) sessions. $T_0$ is the time when the usual
reduction method is applied (i.e. the scheduler reduces each process until it
emits an output); $T_{Eager}$ is the time for the eager reduction procedure alone;
$T_{Min}$ is the time without eager reduction, but using the history to detect
non-minimal traces; $T_{Eager+Min}$ is the time for the full reduction procedure
presented in this paper. As can be seen in the figure, our reduction method
proves quite effective in practice: sometimes a reduction factor of 60 is gained,
and we did not encounter any case where the added checks (for non-minimality)
made history-dependent scheduling slower.



Protocol                                 # roles  # sessions  T_0    T_Eager  T_Min  T_Eager+Min
Needham-Schroeder-Lowe                   2        3           50     6.80     1.18   0.59
                                         2        4           ?      2034     99     41
Needham-Schroeder-Lowe (seven-message)   3        2           277    2.38     1.04   0.16
                                         3        3           ?      963      163    8.46
Otway-Rees                               3        2           0.75   0.32     0.28   0.14
                                         3        3           5879   722      497    98
Carlsen                                  3        2           6.15   4.79     1.29   0.91
                                         3        3           ?      7        7      2272
Kerberos v5                              4        2           5.40   4.11     2.31   1.93

Fig. 2. Times for the analysis of various protocols (in seconds).

Notes to Fig. 2: a question mark instead of a time means that we were not
"patient" enough to wait for the end of the verification procedure. The second
Needham-Schroeder-Lowe entry is the seven-message version of the protocol,
making use of 3 key servers in parallel.

7 Conclusion

In this paper we have presented history-dependent scheduling, a new reduction
method for cryptographic protocols that is based on the monitoring of the
inputs/outputs performed by the processes during the reduction. This method
relies on the two following techniques:

- Eager reduction, which was already presented in [10], is a natural big-step
  semantics based on the outputs of the different processes.

- Detection of non-minimal traces, through the use of a history of the current
  reduction sequence, allows the scheduler to stop the exploration of some
  traces and is based on very simple criteria on the inputs of the processes.

As far as related work is concerned, our technique seems to share some common
ground with the one developed independently in [3], in that both methods
aim to reduce the search space by verifying and maintaining additional
constraints on the input values; however, differences between the formalisms
make the comparison a non-trivial task, which we leave as future work.

It should be noted that we have only defined and proved here our method on 
the ground reduction system; this method can be adapted in a straightforward 
manner to the symbolic system, and the same non-minimality conditions are 
then checked at the symbolic level (although we should mention that the proof 
of completeness for the symbolic case becomes more complex, due to some slight 
differences between the symbolic eager reduction and the ground one). 

As practical experiments show, history-dependent scheduling can be quite
effective in practice, and in all our tests it never induced any slowdown. We
expect that the ideas behind this method are general enough to be applied to
other verification systems, and also stress the fact that this technique is, by its
nature, especially well suited for verification in a depth-first setting.

A Specifying Security Properties through Assertions

We take as an example the following 3-message version of the Needham-Schroeder
Public Key protocol:

  $A \to B : \{na, A\}_{pub(B)}$

  $B \to A : \{na, nb\}_{pub(A)}$

  $A \to B : \{nb\}_{pub(B)}$

In our framework, the protocol can be modeled as follows:

  Init(myid, resp) : fresh na.
                     !E((na, myid), Pub(resp)).
                     ?e. (na', nb) ← dec(e, Priv(myid)). [na' = na].
                     !E(nb, Pub(resp)).
                     assert(secret(nb) ∧ auth(resp, myid, na, nb)). 0

  Resp(myid, init) : ?e. (na, a) ← dec(e, Priv(myid)). [a = init].
                     fresh nb.
                     !_{auth(myid,init,na,nb)} E((na, nb), Pub(init)).
                     ?e'. [e' = E(nb, Pub(myid))].
                     0

The assertion "auth(msg)" is a shortcut for "known(E(msg, K_auth))", and the
instruction "!_{auth(msg)} t" is syntactic sugar for "!(E(msg, K_auth), t)". In this
example, the initiator specifies that at the end of its run of the protocol the
nonce nb must be secret, and expects an agreement with some responder on the
nonces na and nb.



B Proof of Lemma 4.1 

(1) We have $k \to_i k' \to_j \mathit{err}$. The reduction on $j$ is a false assertion, and as we
    have $\mu(k) = \mu(k')$ and as the truth value of an assertion only depends on
    the environment knowledge, the same assertion will also evaluate to false in
    the configuration $k$.

(2) We have $k_1 \xrightarrow{r_1}_i k_2 \xrightarrow{r_2}_j k_3$ ($k_1 = k$ and $k_3 = k'$). The proof is done by
    basic case analysis on the rules $r_1$ and $r_2$. All rules but (?), (a) and (!) neither
    depend on the environment nor modify it, and thus the result
    holds whenever $r_1 \notin \{(a), (?), (!)\}$ or $r_2 \notin \{(a), (?), (!)\}$. Commutation when
    $r_1 = (a)$ or $r_2 = (a)$ is always possible, given the fact that a process may
    choose not to evaluate an assertion. There remain only 4 cases:

    1. $k_1 \xrightarrow{?}_i k_2 \xrightarrow{?}_j k_3$

    2. $k_1 \xrightarrow{!}_i k_2 \xrightarrow{!}_j k_3$

    3. $k_1 \xrightarrow{?}_i k_2 \xrightarrow{!}_j k_3$

    4. $k_1 \xrightarrow{!}_i k_2 \xrightarrow{?}_j k_3$

    Cases (1), (2) and (3) are straightforward. Case (4) is the interesting one
    (and the one from which our eager reduction procedure takes advantage):
    namely, we can perform the input rule first and then reach $k_3$ after an output
    from process number $i$, due to the fact that the input rule only depends
    on the knowledge, which is the same as $\mu(k_1)$ since $k_1 \to_i k_2$. □



C Proof of Theorem 4.1 

We simultaneously prove (1) and (2) by induction on n. 

Case (n = 1). We have to establish: 

(1) k ^ k' ^ err k k' 

X 

(2) k ^ err ^ k err 

Both properties are easily shown by iterating Lemma 4.1 in order to move 
all reductions on the process x at the beginning of the reduction sequence. 
Case (n > 1). By induction: 

(1) fc — » fc' implies fc — » fc" —» fc' and by induction hypothesis we 

know that: ^ 

^kyi—\. k '^~^xi,... ,xn-i kn—1 k 

Then fc„_i — » k — » k , which means fc„_i — » k , and by the result for 

r 

case n = 1 we can deduce 3k„. kn-i kn k ■ 

Xi,... ,X„-1 x„ 

(2) We have k k ^ err and by induction hypothesis: 



3k".k^xu...,x„.^ k" 



k' 



err . 



Then k" -» err and thus, by the result for case n = 1, k" ‘^x^ which 
ends the proof. □ 



D Proof of Lemma 5.3 

First, we will extend Lemma 4.1 with the two following properties {i yf j): 

(1) If fc — k' and X C ^{k) then k k' 

(2) If k k\ — k' and n{k\) yf ^J-{k') then 3/c2- k — /c 2 — k' A /r(fc) yf /i(fc 2 ) 

By iterated application of (1), (2) and Lemma 4.1, we can now move the 
reduction steps on the process j at the beginning of the sequence and get: 

3{k\ k"). k k' — k” -^i ■ err A k^{k”) ■ 

X \ X 

As ^i(k') yf ^(fc"), k ->*j k' k” implies 3q > 1. k k” . On the other 
hand, we know by the completeness of eager reduction that k” err. Therefore 

X \ X \ 

k srr, which implies k err. □ 

Note on case (3) in the proof of Lemma 4.1: this property is "folklore" and used
very broadly in the literature; we can summarize it as: "if, in a parallel
composition, one process performs an input and another one an output, the
output can always be done first without any loss".

Abstract. Typed Assembly Languages (TALs) can be used to validate the safety of
assembly-language programs. However, typing rules are usually trusted as axioms. 
In this paper, we show how to build semantic models for typing judgments in 
TALs based on an induction technique, so that both the type-safety theorem and 
the typing rules can be proved as lemmas in a simple logic. We demonstrate this 
technique by giving a complete model to a sample TAL. This model allows a typing 
derivation to be interpreted as a machine-checkable safety proof at the machine 
level. 



1 Overview 

Safety properties of machine code are of growing concern in both industry and academia. 
If machine code is compiled from a safe source language, compiler verification can ensure 
the safety of the machine code. However, it is generally prohibitive to do verification 
on an industrial-strength compiler due to its size and complexity. In this paper, we do 
validation directly on machine code. Necula introduced Proof-Carrying Code (PCC) 
[1], where a low-level code producer supplies a safety proof along with the code to 
the consumer. He used types to specify loop invariants and limited the scope of the 
proof to type safety. Typed Assembly Language (TAL) [2] by Morrisett et al. refined 
PCC by proposing a full-fledged low-level type system and a type-preserving compiler 
that can automatically generate type annotations as well as the low-level code. Once an 
assembly-language program with type annotations is type checked, the code consumer 
is assured of type safety. 

Take a typing rule from the Cornell TAL [2],

$$\frac{\Psi; \Delta; \Gamma \vdash r : \forall[\,].\Gamma' \qquad \Delta \vdash \Gamma \leq \Gamma'}{\Psi; \Delta; \Gamma \vdash \mathtt{jmp}\ r}$$

which means that a "jump to register r" instruction type-checks if the value in r is a
code pointer with precondition $\Gamma'$, and the current type environment $\Gamma$ is a subtype of
$\Gamma'$. This rule is intuitively "correct" based on the semantics of the jump instruction. (In
this paper we won't be concerned with $\Psi$, $\Delta$, etc. of the Cornell TAL; we show this rule
just to illustrate the complexity of that system's trusted axioms.)

* This research was supported in part by DARPA award F30602-99-1-0519 and by NSF grant 
CCR-0208601. 



B. Steffen and G. Levi (Eds.): VMCAI 2004, LNCS 2937, pp. 30-43, 2004.

In the Cornell TAL system and its variant [3], such typing rules are accepted as
axioms. They are a part of the Trusted Computing Base (TCB). However, low-level type
systems tend to be complex because of intricate machine semantics. Any
misunderstanding of the semantics could lead to errors in the type system. League et al. [4]
found an unsound proof rule in the SpecialJ [5] type system. In the process of refining
our own TAL [6], we routinely find and fix bugs that can lead to unsoundness.

In systems that link trusted to untrusted machine code, errors in the TCB can be 
exploited by malicious code. A more foundational approach is to move the entire type 
system out of the TCB by proving the typing rules as lemmas, instead of trusting them 
as axioms; also by verifying the type-safety theorem: type checking of code implies the 
safety policy, or the slogan — well-typed programs do not go wrong. 

1.1 A Foundational Approach 

In the type-theory community, there is a long tradition of giving denotational semantics 
[7] to types and proving typing rules as lemmas. Appel and Felty [8] applied this idea to 
PCC and gave a semantic model to types and machine instructions in higher-order logic. 
In the following years, semantic models of types have been extended to include recursive 
types [9] and mutable references [10]. With these models, it is possible to reason locally 
about types and operations on values, but unfortunately no model has been provided for
typing judgments such as $\Psi; \Delta; \Gamma \vdash \mathtt{jmp}\ r$, and no method is provided to construct the
safety proof for an entire program.

The main contribution of this paper is to use a good set of abstractions to give models 
to judgments so that both typing rules and the type-safety theorem can be mechanically 
verified in a theorem-proving system. Our approach is truly foundational in that only 
axioms in higher-order logic and the operational semantics of instructions need to be 
trusted; it has a minimal trusted base. Our approach can be viewed as a way to map a 
typing derivation to a machine-checkable safety proof at the machine level, because each 
application of a typing rule in the typing derivation can be seen as the use of a (proved) 
lemma. 

1.2 Model of TALs 

In this section, we give an informal overview of our model of a typed assembly language, 
particularly an induction technique to prove program safety. We will not refer to any 
specific TAL in this section. 

A TAL program consists of two parts: code and type annotations. As an example, the
following code snippet has a number of basic blocks; each one is given a label (like $l_0$)
and is composed of a sequence of instructions. Each basic block has an associated type
annotation (like $\phi_0$) that is the basic block's precondition.

  $\phi_0$    $l_0$ : add 1, 1, 2
                jmp $l_3$

  $\phi_1$    $l_1$ : ld 2, 3

                ....

Type annotations are generated by a type-preserving compiler from source language
types. They serve as a specification (types as specifications). For instance, if $\phi_0$ is
$\{r_1 : \mathtt{int}\} \cap \{r_2 : \mathtt{box}(\mathtt{int})\}$, it expresses that register one is of type integer and register
two is a pointer to an integer.

Since we need to show the type-safety theorem, the first ingredient is to define what
the safety of code means. Following the standard practice in type theory, we define code
to be safe if it will not get stuck before it naturally stops. Note that the trick here is to
define the machine's operational semantics in such a way that the machine will get stuck
if it violates the safety policy (see Section 2). To simplify the presentation, we treat only
nonterminating programs that do not stop naturally.¹

We first informally explore how to prove that a TAL program is safe. We define "a label
$l$ in a state is safe for $k$ steps" to mean that when the control of the state is at $l$, the state
can run for $k$ steps. The goal then is to prove that if entry condition $\phi_0$ holds, label $l_0$ is
safe for $k$ steps for any natural number $k$. A natural thought is to show it by induction
over $k$. The base case ($k = 0$) is trivial; the inductive case is to show that label $l_0$ is safe for
$k + 1$ steps given that it is safe for $k$ steps. But at this moment we have no idea in which
state the code will be after $k$ steps, so we cannot prove that the state can go ahead one
more step.

The solution is to do induction simultaneously over all labels, i.e. prove that each label $l_i$
is safe for $k + 1$ steps with respect to its precondition $\phi_i$, assuming all labels $l_j$ are safe
for $k$ steps with respect to $\phi_j$. Let us take label $l_0$ in the example to see why this new
induction works. Basic block $l_0$ has length two, and ends with a jump to label $l_3$, which
has been assumed to be safe for $k$ steps, provided that precondition $\phi_3$ is met. Suppose
that by inspecting the two instructions in block $l_0$, we have concluded that $l_0$ is safe for two
steps and that after two steps $\phi_3$ will be true. Combined with the assumed $k$-step safety of
label $l_3$, label $l_0$ is safe for $k + 2$ steps, which implies that it is safe for $k + 1$ steps.
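The shape of this simultaneous induction can be illustrated with a small Python sketch. It is not the paper's formal model: each hypothetical block is reduced to a flag saying whether inspecting its instructions succeeded (they are safe and establish the target's precondition) plus the label it jumps to. The names `BLOCKS` and `safe_for` are assumptions made for the illustration.

```python
# Hypothetical program: label -> (block instructions check out, jump target).
BLOCKS = {
    "l0": (True, "l3"),
    "l1": (True, "l0"),
    "l3": (True, "l1"),
}


def safe_for(k, blocks):
    """Map each label to whether the argument shows it safe for k steps."""
    safe = {label: True for label in blocks}   # base case: k = 0 is trivial
    for _ in range(k):
        # Inductive step, done for all labels at once: a block whose
        # instructions are certified, and whose target is safe for the
        # previous number of steps, is safe for (at least) one more step.
        safe = {label: ok and safe[target]
                for label, (ok, target) in blocks.items()}
    return safe


all_safe = safe_for(100, BLOCKS)

# If one block fails inspection, the failure propagates to every label
# that can reach it as k grows.
broken = dict(BLOCKS, l3=(False, "l1"))
after_one = safe_for(1, broken)
after_three = safe_for(3, broken)
```

The point of the sketch is that the step from k to k + 1 never needs to know which state the program is in after k steps; it only uses the per-block check and the k-step assumption about the jump target.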

In this proof method, we still need to inspect each instruction in every block. For
example, in block $l_0$, we check if the precondition $\phi_0$ is enough to certify the safety of
its two instructions and if $\phi_3$ will be met after their execution. What we have described
essentially is a proof method to combine small proofs about instructions into a proof 
about the safety of the whole program. In the rest of this section, we informally give 
models to typing judgments based on this technique. Before that, we first motivate what 
kind of typing judgments a TAL would usually have. 

To type check a TAL program, the type system must have a wellformedness judgment 
for programs. Since a TAL program is composed of basic blocks, the wellformedness 
judgment for programs requires another judgment, a wellformedness judgment for basic 
blocks. Similarly, the type system should also have a wellformedness judgment for 
instructions. 

The model of the wellformedness judgment for programs can be that all labels
are safe with respect to their preconditions. In the following sections, we will develop
abstractions so that this model can be written down in a succinct subtyping formula:
$\Delta(C) \subset \Gamma$. The model of wellformedness of a basic block can be that this particular
basic block is safe for $k + 1$ steps assuming all the other basic blocks are safe for $k$ steps.
Based on the induction technique and this model, we can prove the typing rule that
concludes the wellformedness of a program from the wellformedness of basic blocks.
The model of wellformedness of instructions is similar to the one of basic blocks and
we do not go into details at this stage.

¹ Every program can be transformed into this form by giving it a continuation at the beginning
and letting the last instruction be a jump to this continuation. The continuation could return to
the operating system, for example.

In Section 2, we present the model of a RISC architecture (Sparc) and formally 
define the safety of code based on a particular safety policy (memory safety); Section 3 
shows the syntax of a sample TAL; Section 4 shows an indexed model of types and its 
intuition. The material in these three sections has been described by other papers [11, 6,
9] as part of the foundational PCC project; we briefly sketch them to set up a framework
within which our proof method can be formally presented in Section 5.



2 Safety Specification 

Our machine model consists of a set of formulas in higher-order logic that specify the 
decoding and operational semantics of instructions. Our safety policy specifies which 
addresses may be loaded and stored by the program (memory safety) and defines what 
the safety of code means. Our machine model and safety policy are trusted and are small 
enough to be “verifiable by inspection”. A machine state (r, m) consists of a register 
bank r and a memory m, which are modeled as functions from numbers to contents 
(also numbers). A machine instruction is modeled by a relation between machine states
$(r, m)$ and $(r', m')$ [11]. For example, a load instruction (ld) is specified by²

$\mathtt{ld}\ s, d \;=\; \lambda r, m, r', m'.\ r'(d) = m(r(s)) \wedge (\forall x \neq d.\ r'(x) = r(x)) \wedge m' = m \wedge \mathit{readable}(r(s))$



The machine operational semantics is modeled by a step relation $\mapsto$ that steps from
one state $(r, m)$ to another state $(r', m')$ [11], where the state $(r', m')$ is the result of
first decoding the current machine instruction, incrementing the program counter and
then executing the machine instruction.

The important property of our step relation is that it is deliberately partial: it omits
any step that would be illegal under the safety policy. For example, suppose in some
state $(r, m)$ the program counter points to a ld instruction that would, if executed, load
from an address that is unreadable according to the safety policy. Then, since our ld
instruction requires that the address must be readable, there will not exist $(r', m')$ such
that $(r, m) \mapsto (r', m')$.

The mixing of machine semantics and safety policy follows the standard practice
in type theory so that we can get a clean and uniform definition of code safety. For
instance, we can define that a state is safe if it cannot lead to a stuck state.

$\mathit{safe}(r, m) = \forall r', m'.\ (r, m) \mapsto^{*} (r', m') \Rightarrow \exists r'', m''.\ (r', m') \mapsto (r'', m'')$

where $\mapsto^{*}$ denotes zero or more steps.

² Our step relation first increments pc, then executes an instruction. Thus, the semantics of ld
does not include the semantics of incrementing the pc.

To show $\mathit{safe}(r, m)$, it suffices to prove that the state is "safe for $n$ steps," for any
natural number $n$.

$\mathit{safe\_n}(n, r, m) = \forall r', m'.\ \forall j < n.\ (r, m) \mapsto^{j} (r', m') \Rightarrow \exists r'', m''.\ (r', m') \mapsto (r'', m'')$

where $\mapsto^{j}$ denotes $j$ steps being taken.
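A partial step relation and the bounded safety predicate can be mimicked by a small executable sketch. The following Python is an illustration only, under several assumptions that are not in the paper: the set `READABLE` stands for the safety policy, instructions live in a separate `code` dictionary instead of being decoded from memory words, `ba` takes an absolute target, and a stuck machine is represented by `None`.

```python
READABLE = {100, 104}          # hypothetical policy: addresses that may be loaded


def ld(s, d):
    """ld s, d: defined only when the address held in register s is readable."""
    def execute(r, m):
        addr = r[s]
        if addr not in READABLE:
            return None        # no successor state exists: the machine is stuck
        r2 = dict(r)
        r2[d] = m.get(addr, 0)
        return r2, m           # memory is unchanged by a load
    return execute


def ba(target):
    """Unconditional branch to an absolute address (a simplification)."""
    def execute(r, m):
        r2 = dict(r)
        r2["pc"] = target
        return r2, m
    return execute


def step(r, m, code):
    """One step: fetch the instruction, increment pc, then execute it."""
    instr = code.get(r["pc"])
    if instr is None:
        return None
    r2 = dict(r)
    r2["pc"] = r["pc"] + 4
    return instr(r2, m)


def safe_n(n, r, m, code):
    """True if the (deterministic) machine can take n steps from (r, m)."""
    state = (r, m)
    for _ in range(n):
        state = step(state[0], state[1], code)
        if state is None:
            return False
    return True


code = {0: ld(1, 2), 4: ba(0)}
memory = {100: 42}
good = {"pc": 0, 1: 100, 2: 0}   # register 1 holds a readable address
bad = {"pc": 0, 1: 999, 2: 0}    # register 1 holds an unreadable address
```

Because the sketch is deterministic, quantifying over all states reachable in fewer than n steps collapses to running the machine n times; the state `bad` is safe for 0 steps but not for 1.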

An assembly-language program $C$ is a list of assembly instructions. For example,

$C$ = add r1, r1, r2; jmp $l_3$; ld [r2], r3; ...

We use predicate $\mathit{prog\_loaded}(m, C)$ to mean that code $C$ is loaded in memory $m$:

$\mathit{prog\_loaded}(m, C) = \forall\, 0 \leq k < |C|.\ \mathit{decode}(m(4k), C_k)$

where $|C|$ is the length of the list; predicate $\mathit{decode}(x, C_k)$ means that word $x$ is decoded
into instruction $C_k$ (the $k$-th instruction in $C$). In this paper, we assume code $C$ is always
loaded at start address 0 and thus the $k$-th instruction will be at address $4k$ in the memory
(Sparc instructions are four bytes long).
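As a quick illustration of the addressing convention, here is a Python sketch of the loading predicate. The decoder is a toy assumption (a memory word is taken to be the instruction tuple itself); a real decoder would relate 32-bit Sparc words to instructions.

```python
def decode(word, instr):
    """Toy decoder (an assumption): a memory word is the instruction itself."""
    return word == instr


def prog_loaded(m, code):
    # Instruction k of the program must sit at byte address 4 * k.
    return all(decode(m.get(4 * k), instr) for k, instr in enumerate(code))


C = [("add", 1, 1, 2), ("ba", 12), ("ld", 2, 3)]
memory = {0: C[0], 4: C[1], 8: C[2], 100: 7}   # address 100 holds unrelated data
```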

We define that an assembly program $C$ is safe if any initial state $(r, m)$ satisfying
the following is a safe state: code $C$ is loaded inside $m$; the program counter initially
points to address 0; when the program begins executing, entry condition $\phi_0$³ holds on
the state $(r, m)$.

$\mathit{safe\_code}(C) = \forall r, m.\ (\mathit{prog\_loaded}(m, C) \wedge r(\mathit{pc}) = 0 \wedge (r, m) : \phi_0) \Rightarrow \mathit{safe}(r, m)$



3 Typed Assembly Language 

In this section, we introduce a typed assembly language. We will show here a small 
subset of our actual implementation. Our full language has hundreds of operators and 
rules, as necessary for production-scale safety checking of real software (so it’s a good 
thing that our full soundness proof is machine-checkable). 



3.1 Syntax 

Figure 1 shows our TAL syntax. A TAL program consists of assembly program $C$ and
type annotation $\Gamma$. Assembly program $C$ is a sequence of assembly instructions, which
include addition (add), load (ld) and branch always (ba).⁴ Type annotation $\Gamma$ takes the
form of a label environment, which summarizes the preconditions of all the labels of the
program. Our TAL has 32 registers, and labels are divisible by 4.

³ In our implementation, the initial condition $\phi_0$ is simple enough to be described directly in our
underlying logic so that the semantic model of types is not a part of the specification of the safety
theorem; it is contained entirely within the proof of the theorem.

⁴ Since our sample TAL cannot deal with delay slots, the ba instruction is really a ba followed
by a nop in Sparc.

(program)       $C$      ::=  $i \mid i; C$
(instruction)   $i$      ::=  add $r, r, r$ | ld $r, r$ | ba $l$
(label env)     $\Gamma$ ::=  $\{l : \mathtt{codeptr}(\phi)\} \cap \ldots$
(register num)  $r$      ::=  0 | 1 | ... | 31
(label)         $l$      ::=  0 | 4 | 8 | ...

Fig. 1. Syntax: Typed assembly language syntax

(types)     $\tau$  ::=  $\mathtt{int} \mid \mathtt{int}_{=}(n) \mid \mathtt{box}(\tau) \mid \mathtt{codeptr}(\phi)$
(type env)  $\phi$  ::=  $\top_\phi \mid \bot_\phi \mid \{n : \tau\} \mid \phi_1 \cap \phi_2 \mid \phi[n \mapsto \tau]$
(nat)       $n$     ::=  0 | 1 | 2 | ...

Fig. 2. Syntax: Types



Figure 2 lists the type and type environment constructors. They include integer type
$\mathtt{int}$ and immutable reference type $\mathtt{box}(\tau)$. The language also has singleton type $\mathtt{int}_{=}(n)$
containing only value $n$. An address $l$ has type $\mathtt{codeptr}(\phi)$ if it is safe to pass the control
to address $l$ provided that precondition $\phi$ is met.

A type environment $\phi$ specifies types of slots in a vector, such as a register bank
or the list of program labels. Any vector satisfies environment $\top_\phi$ and no vector can
satisfy $\bot_\phi$. A singleton environment $\{n : \tau\}$ means slot $n$ (e.g., register $n$ or label $n$)
has type $\tau$. Intersection type $\phi_1 \cap \phi_2$ can be used to type several slots of a vector, e.g.
$\{n_1 : \tau_1\} \cap \{n_2 : \tau_2\}$ specifies that slots $n_1$ and $n_2$ have type $\tau_1$ and $\tau_2$, respectively.

We use $\phi \subset \{n : \tau\}$ to describe that $\{n : \tau\}$ is one of the conjuncts in $\phi$. We write
$\phi(n)$ for the type of slot $n$ in $\phi$. Notation $\phi[n \mapsto \tau]$ updates the type of slot $n$ to $\tau$, by
first removing the old entry for $n$ in $\phi$ (if one exists), then intersecting it with $\{n : \tau\}$.

The type annotation $\Gamma$, or the label environment, specifies the type of each label in
terms of the code pointer type, so it is also a type environment; we use the same operators
for $\Gamma$ as for $\phi$.



3.2 Type Checking 

There are three kinds of judgments in our type system: 

- Program judgment $\vdash_p C : \Gamma$ means that assembly program $C$ is wellformed with
  respect to type annotation $\Gamma$.

- Block judgment $\Gamma; l \vdash_b C : \Gamma'$ means that assembly program $C$, starting at
  address $l$, is wellformed with respect to $\Gamma'$, assuming the global label environment
  $\Gamma$. Environment $\Gamma$ provides preconditions of labels to which $C$ might jump;
  environment $\Gamma'$ is the collection of preconditions of labels inside $C$ and is a part of $\Gamma$.
  Superficially, it seems semantically circular to judge the wellformedness of some
  labels ($\Gamma'$) by assuming the wellformedness of all labels ($\Gamma$). However, as indicated
  in the overview section, the model of $\vdash_b$ is that from a weaker assumption about $\Gamma$
  (every label inside is safe for $k$ steps), we prove a stronger result about $\Gamma'$ (every
  label inside is safe for $k + 1$ steps).

- Instruction judgment $\Gamma; l \vdash_i \{\phi_1\}\ i\ \{\phi_2\}$ means that assembly instruction $i$, at
  address $l$, is wellformed with respect to precondition $\phi_1$ and postcondition $\phi_2$. As in
  $\vdash_b$, $\Gamma$ provides label preconditions. The purpose of having location $l$ in the judgment
  is to be able to compute the destination address for pc-relative jump instructions.

Typing rules except for instructions are shown in Figure 3. To check that program
$C$ is wellformed, the PROG rule will call $\Gamma; 0 \vdash_b C : \Gamma$, and thus recursively call the BLOCK_1
and BLOCK_2 rules to check that each basic block in $C$ is wellformed.

Rule BLOCK_1 first looks up precondition $\phi_1$ and postcondition $\phi_2$ in $\Gamma$ for the current
block, composed of one instruction $i$; checks the wellformedness of instruction $i$ with
respect to $\phi_1$ and $\phi_2$; then checks the rest of the code with respect to $\Gamma'$. Without loss
of generality, we assume that each block has exactly one instruction. In rule BLOCK_2,
the postcondition of the last instruction $i$ is $\bot_\phi$, because the control is not allowed to go
beyond the last instruction (an unconditional branch satisfies this postcondition).



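$$\frac{\Gamma; 0 \vdash_b C : \Gamma}{\vdash_p C : \Gamma}\ \text{PROG}$$

$$\frac{\Gamma(l) = \mathtt{codeptr}(\phi_1) \quad \Gamma(l+4) = \mathtt{codeptr}(\phi_2) \quad \Gamma; l \vdash_i \{\phi_1\}\ i\ \{\phi_2\} \quad \Gamma; l+4 \vdash_b C : \Gamma'}{\Gamma; l \vdash_b i; C : \{l : \mathtt{codeptr}(\phi_1)\} \cap \Gamma'}\ \text{BLOCK\_1}$$

$$\frac{\Gamma(l) = \mathtt{codeptr}(\phi) \quad \Gamma; l \vdash_i \{\phi\}\ i\ \{\bot_\phi\}}{\Gamma; l \vdash_b i : \{l : \mathtt{codeptr}(\phi)\}}\ \text{BLOCK\_2}$$

Fig. 3. Syntax: Typing rules (except for instructions)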



Figure 4 shows typing rules for instructions. The rule for instruction add requires
that the source registers are of type int beforehand and the destination register gets type
int afterward. The rule for instruction ba needs to look up the type of the destination
label through $\Gamma$ and check that the current precondition $\phi$ matches the destination one (a
TAL usually has subtyping rules allowing the current precondition to be stronger than
the destination one).

$$\frac{\phi \subset \{s_1 : \mathtt{int}\} \quad \phi \subset \{s_2 : \mathtt{int}\}}{\Gamma; l \vdash_i \{\phi\}\ \mathtt{add}\ s_1, s_2, d\ \{\phi[d \mapsto \mathtt{int}]\}}
\qquad
\frac{\phi \subset \{s : \mathtt{box}(\tau)\}}{\Gamma; l \vdash_i \{\phi\}\ \mathtt{ld}\ s, d\ \{\phi[d \mapsto \tau]\}}$$

$$\frac{\Gamma(l + d) = \mathtt{codeptr}(\phi)}{\Gamma; l \vdash_i \{\phi\}\ \mathtt{ba}\ d\ \{\bot_\phi\}}$$

Fig. 4. Syntax: Typing rules for instructions
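To make the rules of Figures 3 and 4 concrete, the following Python sketch checks a program with one instruction per block. It is a simplified illustration, not the paper's checker: a type environment is a dictionary from slot to type, the label environment maps each label directly to its precondition (standing for codeptr of that precondition), and the unsatisfiable postcondition is accepted in any position, which assumes a subtyping rule that the paper does not show. All names are hypothetical.

```python
BOTTOM = "bottom"   # stands for the unsatisfiable environment


def check_instr(gamma, l, pre, instr):
    """Return the postcondition of instr under precondition pre, or None."""
    op = instr[0]
    if op == "add":
        _, s1, s2, d = instr
        if pre.get(s1) == "int" and pre.get(s2) == "int":
            return {**pre, d: "int"}           # update slot d to int
    elif op == "ld":
        _, s, d = instr
        t = pre.get(s)
        if isinstance(t, tuple) and t[0] == "box":
            return {**pre, d: t[1]}            # d gets the boxed type
    elif op == "ba":
        _, disp = instr
        if gamma.get(l + disp) == pre:         # no subtyping: exact match
            return BOTTOM
    return None                                # no rule applies


def check_program(gamma, code):
    """Check every one-instruction block, in the spirit of PROG/BLOCK_1/BLOCK_2."""
    for k, instr in enumerate(code):
        l = 4 * k
        if l not in gamma:
            return False
        post = check_instr(gamma, l, gamma[l], instr)
        if post is None:
            return False
        if post == BOTTOM:
            continue                           # assumption: bottom fits any successor
        if k == len(code) - 1 or post != gamma.get(l + 4):
            return False                       # falls off the end, or mismatch
    return True


phi = {1: "int", 2: "int", 3: ("box", "int"), 4: "int"}
gamma = {0: phi, 4: phi, 8: phi}
code = [("add", 1, 1, 2), ("ld", 3, 4), ("ba", -8)]
bad_gamma = {0: {**phi, 3: "int"}, 4: phi, 8: phi}
```

The exact-equality test on environments is the price of leaving out subtyping; a fuller checker would instead test that the computed postcondition is a subtype of the next block's precondition.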



4 Indexed Model of Types 

In this section, we give a brief description of the indexed model of types, which is
introduced in [9] to model general recursive types. Our induction technique in the overview
section is also inspired by the intuition behind the indexed model. In the indexed model, a
type is a set of indexed values $\{(k, m, x)\}$, where $k$ is a natural number ("approximation"
index), $m$ is a memory, and $x$ is an integer.

The indexed models of the types and type environments are listed below. For example,
type $\mathtt{int}_{=}(3)$ would contain all the $(k, m, x)$ such that $x$ is 3. Memory $m$ is a part of
a value $(k, m, x)$ because to express that $x$ is of type $\mathtt{box}(\tau)$ we need to say that the
content in the memory, or $m(x)$, is related to type $\tau$.

$\mathtt{int} = \{(k, m, x) \mid \mathit{true}\}$        $\mathtt{int}_{=}(n) = \{(k, m, x) \mid x = n\}$

$\mathtt{box}(\tau) = \{(k, m, x) \mid x \in \mathrm{dom}(m) \wedge \forall j < k.\ \mathit{readable}(x) \wedge (j, m, m(x)) \in \tau\}$

$\mathtt{codeptr}(\phi) = \{(k, m, x) \mid \forall j, r.\ j < k \wedge r(\mathit{pc}) = x \wedge (m, r) :_j \phi \Rightarrow \mathit{safe\_n}(j, r, m)\}$

$\top_\phi = \{(k, m, \vec{x}) \mid \mathit{true}\}$        $\bot_\phi = \{(k, m, \vec{x}) \mid \mathit{false}\}$

$\{n : \tau\} = \{(k, m, \vec{x}) \mid (k, m, x_n) \in \tau\}$

$\phi_1 \cap \phi_2 = \{(k, m, \vec{x}) \mid (k, m, \vec{x}) \in \phi_1 \wedge (k, m, \vec{x}) \in \phi_2\}$

$\phi[n \mapsto \tau] = \{(k, m, \vec{x}) \mid \exists y.\ (k, m, \vec{x}[n \mapsto y]) \in \phi \wedge (k, m, x_n) \in \tau\}$

We use $(m, x) :_k \tau$ as syntactic sugar for $(k, m, x) \in \tau$. We write $(m, x) : \tau$ to
mean that $(m, x) :_k \tau$ is true for any $k$, or that $(m, x)$ is a real member of type $\tau$. Now we explain
the purpose of index $k$ in the model. In general, if $(m, x) :_k \tau$, value $(m, x)$ may be a
real member of type $\tau$, or it may be a "fake" member that only $k$-approximately belongs
to $\tau$. Any program taking such a "fake" member as an input cannot tell the difference
within $k$ steps.

Let type $\tau$ be $\mathtt{box}(\mathtt{box}(\mathtt{int}))$. Suppose $(m, x) : \tau$; then $x$ is a two-fold pointer and
$m(m(x))$ is of type int. However, suppose we only know that $(m, x) : \mathtt{box}(\mathtt{int})$; then
for one step (one dereference), $(m, x)$ safely simulates membership in $\mathtt{box}(\mathtt{box}(\mathtt{int}))$.
In this case, we can say $(m, x) :_1 \mathtt{box}(\mathtt{box}(\mathtt{int}))$.

One property of types is that they are closed under decreasing approximations, that
is, if $(m, x) :_k \tau$ and $j < k$, then $(m, x) :_j \tau$.
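The approximation index can be explored with a small executable model. The Python below is a sketch under assumptions: a type is represented as a predicate on triples (k, m, x), memory is a dictionary made total by a default of 0 (so the domain condition always holds), and `READABLE` is a hypothetical safety policy.

```python
READABLE = {100}                      # hypothetical safety policy


def load(m, a):
    return m.get(a, 0)                # total memory: unmapped words hold 0


def int_type(k, m, x):
    return True                       # every value is an int at every index


def box(t):
    """box(t): for every j < k, x is readable and m(x) is in t at index j."""
    def member(k, m, x):
        # x is always in dom(m) here because memory is total.
        return all(x in READABLE and t(j, m, load(m, x)) for j in range(k))
    return member


m = {100: 7}                          # address 100 holds the plain integer 7
box_int = box(int_type)
box_box_int = box(box_int)
```

With this memory, address 100 is a genuine member of box(int) at every index, belongs to box(box(int)) at index 1, and is rejected at index 2, because the second dereference would need the non-readable address 7.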

Another example to understand the approximation index $k$ is the type $\mathtt{codeptr}(\phi)$.
A real member $(m, l)$ of type $\mathtt{codeptr}(\phi)$ means that if condition $\phi$ is met, it is safe to
jump to location $l$. Then $(m, l) :_k \mathtt{codeptr}(\phi)$ would mean that it is safe to execute $k$
steps after jumping to $l$. Therefore, the definition of $\mathtt{codeptr}(\phi)$ says that for any $j$ and
$r$ such that $j$ is less than $k$, if the control is at location $l$ and the current state satisfies $\phi$,
the state should be safe for $j$ steps. In some sense, this definition only guarantees partial
safety: safe within $k$ steps. To show that location $l$ is a safe location, we have to prove
that it belongs to $\mathtt{codeptr}(\phi)$ under any $k$.

Sometimes we need to judge not only scalar values such as $(k, m, x)$ but also vector
values $(k, m, \vec{x})$ (a vector is a function from numbers to values). One use is to write
$(m, r) :_k \phi$, which means that the contents of machine registers satisfy $\phi$. In this case,
$\vec{x}$ is the register bank $r$. Another use of vector types is the label environment $\Gamma$, which
summarizes the types of all program labels. In this case, $\vec{x}$ would be the identity vector
$\mathit{id}$ (mapping $l$ to $l$). For example, $(m, \mathit{id}) : \{l : \mathtt{codeptr}(\phi)\}$ means that address $l$ itself has
type $\mathtt{codeptr}(\phi)$.

With the semantic model of types, all the subtyping rules and introduction/elimination
rules of types (not shown in the paper) can be proved as lemmas [8,9].
In particular, the codeptr elimination rule CPTR_E is useful for the proof of Theorem 2
in Section 5.3.

$$\frac{(m, x) :_{k+1} \mathtt{codeptr}(\phi) \qquad r(\mathit{pc}) = x \qquad (m, r) :_k \phi}{\mathit{safe\_n}(k, r, m)}\ \text{CPTR\_E}$$

5 Semantic Models of Typing Judgments

In this section, we develop models for typing judgments based on the induction technique
in Section 1.2. From their models, each of the typing rules is proved as a derived lemma.
Finally, the type-safety theorem is also proved as a derived lemma.

5.1 Subtype Induction

Our goal is to provide a proof that code $C$ obeys our safety policy, or a proof of
$\mathit{safe\_code}(C)$, which means that any state containing the code $C$ is safe for arbitrarily
many steps under condition $\phi_0$, which is exactly what the type $\mathtt{codeptr}(\phi_0)$ denotes. Thus, our
goal is formalized as $(m, 0) : \mathtt{codeptr}(\phi_0)$, for any $m$ containing code $C$.

As outlined in the overview section, we will prove a stronger result instead: all labels
are of code-pointer types under corresponding preconditions. Formally, for any label $l$
in the domain of label environment $\Gamma$, we will prove $(m, l) : \Gamma(l)$; another way to state
this is $(m, \mathit{id}) : \Gamma$.

A condition on m is that it must contain the code C. This condition can also be 
formalized as types. After all, in our model, types are predicates over states and can be 
used to specify invariants of states. Type instr (i) expresses that instruction i is in the 
memory. 

$$\mathrm{instr}(i) = \{(k, m, x) \mid \mathrm{decode}(m(x), i)\}$$

The type environment constructor $\Delta$ turns a sequence of assembly instructions into a type 
environment that describes the code. 

$$\Delta(i_0; i_1; \ldots) = \{0 : \mathrm{instr}(i_0)\} \cap \{4 : \mathrm{instr}(i_1)\} \cap \ldots$$

With constructor $\Delta$, the judgment $(m, \mathrm{id}) : \Delta(C)$ formally states that memory $m$ 
contains code $C$. For such a value $(m, \mathrm{id})$, we want to show that it also satisfies $\Gamma$, that is, $(m, \mathrm{id}) : \Gamma$. 
Now if we define 

$$\Delta(C) \subset \Gamma \;\equiv\; \forall k, m.\ (m, \mathrm{id}) :_k \Delta(C) \Rightarrow (m, \mathrm{id}) :_k \Gamma$$

then $\Delta(C) \subset \Gamma$ expresses that any state containing code $C$ respects invariants $\Gamma$ under 
any approximation $k$. 

We now explore how to prove $\Delta(C) \subset \Gamma$. Assuming $(m, \mathrm{id}) :_k \Delta(C)$, we have to show 
$(m, \mathrm{id}) :_k \Gamma$. We prove it by induction over $k$. When $k$ is zero, the judgment $(m, l) :_0 
\Gamma(l)$ is trivially true since $\Gamma(l)$ is a code-pointer type, which is always true at index zero 
by its definition. The inductive case is that, assuming $(m, \mathrm{id}) :_k \Delta(C) \Rightarrow (m, \mathrm{id}) :_k \Gamma$, 
we have to show that $(m, \mathrm{id}) :_{k+1} \Delta(C) \Rightarrow (m, \mathrm{id}) :_{k+1} \Gamma$. Since the code environment 
$\Delta(C)$ ignores the index, $(m, \mathrm{id}) :_{k+1} \Delta(C)$ is equivalent to $(m, \mathrm{id}) :_k \Delta(C)$. Therefore, 
to prove $(m, \mathrm{id}) :_{k+1} \Gamma$, we have both $(m, \mathrm{id}) :_k \Delta(C)$ and $(m, \mathrm{id}) :_k \Gamma$. 

Our intention is to give models to the typing judgments in our TAL based on this 
proof technique. To make the models concise, we abstract away from the indexes by 
defining a subtype-plus predicate $\Gamma_1 \Subset \Gamma_2$ to simulate the inductive case. 

$$\Gamma_1 \Subset \Gamma_2 \;\equiv\; \forall k, m.\ (m, \mathrm{id}) :_k \Gamma_1 \Rightarrow (m, \mathrm{id}) :_{k+1} \Gamma_2$$

With the subtype-plus operator, the inductive case to prove $\Delta(C) \subset \Gamma$ can be written 
as $\Delta(C) \cap \Gamma \Subset \Gamma$: assuming code $C$ is in the memory under index $k$ and $\Gamma$ is true 
under index $k$, prove that $\Gamma$ is true under index $k + 1$. The following subtype induction 
theorem formalizes what we have explained. 

Theorem 1. (Subtype Induction) 

$$\frac{\Delta(C) \cap \Gamma \Subset \Gamma}{\Delta(C) \subset \Gamma}$$



5.2 Semantic Model of Typing Judgments 

At the heart of our semantic model is a set of concise definitions for the typing judgments 
based on the abstractions (especially $\Subset$) we have developed. We exhibit these 
definitions here: 

$$\begin{array}{rcl}
\vdash_p C : \Gamma & \equiv & \Delta(C) \subset \Gamma \\
\Gamma; l \vdash_b C : \Gamma' & \equiv & \mathrm{offset}_l(\Delta(C)) \cap \Gamma \Subset \Gamma' \\
\Gamma; l \vdash_i \{\phi_1\}\ i\ \{\phi_2\} & \equiv & \{l : \mathrm{instr}(i)\} \cap \Gamma \cap \{l + 4 : \mathrm{codeptr}(\phi_2)\} \Subset \{l : \mathrm{codeptr}(\phi_1)\}
\end{array}$$

where $\mathrm{offset}_l(\{0 : \tau_0\} \cap \{4 : \tau_4\} \cap \ldots) = \{l : \tau_0\} \cap \{l + 4 : \tau_4\} \cap \ldots$. 

The model of $\vdash_p C : \Gamma$ means that any state having the code inside respects 
invariants $\Gamma$. 

The judgment $\Gamma; l \vdash_b C : \Gamma'$ judges the validity of $\Gamma'$ by assuming $\Gamma$. Since $\Gamma'$ 
is the collection of preconditions of labels inside $C$ and is a part of $\Gamma$, the judgment 
itself has a superficial semantic circularity. We solve this circularity by giving it a model 
based on operator $\Subset$. By assuming $\Gamma$ to approximation $k$, we prove $\Gamma'$ to approximation 
$k + 1$. Also, since code $C$ starts at address $l$, we need to use $\mathrm{offset}_l(\Delta(C))$ to make a 
code environment that starts at address $l$. 

The semantics of $\Gamma; l \vdash_i \{\phi_1\}\ i\ \{\phi_2\}$ follows the same principle as the one for the 
model of $\vdash_p$. Assuming that instruction $i$ is at location $l$ and $\Gamma$ holds to approximation 
$k$, we prove that location $l$ is a code pointer to approximation $k + 1$. In the model of $\vdash_i$, 
we have an extra assumption $\{l + 4 : \mathrm{codeptr}(\phi_2)\}$. In our sample TAL, since every 
basic block has only one instruction, every address would have a precondition in $\Gamma$. 
Therefore, $\{l + 4 : \mathrm{codeptr}(\phi_2)\}$ is a part of $\Gamma$ and need not have been specified as a 
separate conjunct. However, in the case of multi-instruction basic blocks, $\Gamma$ would only 
have preconditions for each basic block; $\phi_2$ in this case would be reconstructed by the 
type system and would not be available in $\Gamma$. 



5.3 Semantic Proofs of Typing Rules 

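Using these models, we can prove both the type-safety theorem and the typing rules in 
Fig. 3. 

Theorem 2. (Type Safety) 

$$\frac{\vdash_p C : \Gamma \qquad \Gamma \subset \{0 : \mathrm{codeptr}(\phi_0)\}}{\mathrm{safe\_code}(C)}$$

Proof. From the definition of $\mathrm{safe\_code}(C)$, we have the following assumptions for 
state $(r, m)$ and want to show that $\mathrm{safe}(r, m)$. 

i) $\mathrm{prog\_loaded}(C, m)$ ii) $r(pc) = 0$ iii) $(m, r) : \phi_0$ 

On the other hand, the model of $\vdash_p C : \Gamma$ is $\Delta(C) \subset \Gamma$. The deduction steps from 
$\Delta(C) \subset \Gamma$ to $\mathrm{safe}(r, m)$ are summarized by the following proof tree, listed here step by step 
from the leaves (7) down to the root (1). 

$$\begin{array}{lll}
(7) & \mathrm{prog\_loaded}(C, m) & \Longrightarrow\ (m, \mathrm{id}) : \Delta(C) \\
(6) & (m, \mathrm{id}) : \Delta(C), \quad \Delta(C) \subset \Gamma, \quad \Gamma \subset \{0 : \mathrm{codeptr}(\phi_0)\} & \Longrightarrow\ (m, \mathrm{id}) : \{0 : \mathrm{codeptr}(\phi_0)\} \\
(5) & (m, \mathrm{id}) : \{0 : \mathrm{codeptr}(\phi_0)\} & \Longrightarrow\ (m, 0) : \mathrm{codeptr}(\phi_0) \\
(4) & (m, 0) : \mathrm{codeptr}(\phi_0) & \Longrightarrow\ \forall k.\ (m, 0) :_k \mathrm{codeptr}(\phi_0) \\
(3a) & \forall k.\ (m, 0) :_k \mathrm{codeptr}(\phi_0) & \Longrightarrow\ \forall k.\ (m, 0) :_{k+1} \mathrm{codeptr}(\phi_0) \\
(3b) & (m, r) : \phi_0 & \Longrightarrow\ \forall k.\ (m, r) :_k \phi_0 \\
(2) & \forall k.\ (m, 0) :_{k+1} \mathrm{codeptr}(\phi_0), \quad r(pc) = 0, \quad \forall k.\ (m, r) :_k \phi_0 & \Longrightarrow\ \forall k.\ \mathrm{safe\_n}(k, r, m) \\
(1) & \forall k.\ \mathrm{safe\_n}(k, r, m) & \Longrightarrow\ \mathrm{safe}(r, m)
\end{array}$$

Step (1) says that to prove $(r, m)$ is safe, it suffices to prove that $(r, m)$ is safe for an 
arbitrary $k$ steps. Step (2) is justified by rule cptr_e in Section 4. Step (3a) is by universal 
instantiation. Step (4) is just the unfolding of the syntactic sugar of $(m, 0) : \mathrm{codeptr}(\phi_0)$. 
Step (5) is by the definition of the singleton environment. Step (6) is by the transitivity 
of subtyping. Step (7) can be easily proved by unfolding definitions. □ 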



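Theorem 3. 

$$\frac{\Gamma; 0 \vdash_b C : \Gamma}{\vdash_p C : \Gamma}\ \ \text{prog}$$

Proof. From the models of $\vdash_p$ and $\vdash_b$, and since $\mathrm{offset}_0(\Delta(C)) = \Delta(C)$, we have to prove 
$\Delta(C) \subset \Gamma$ from $\Delta(C) \cap \Gamma \Subset \Gamma$, which is exactly the subtype induction theorem (Theorem 1). □ 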



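Theorem 4. 

$$\frac{\Gamma(l) = \mathrm{codeptr}(\phi_1) \qquad \Gamma(l + 4) = \mathrm{codeptr}(\phi_2) \qquad \Gamma; l \vdash_i \{\phi_1\}\ i\ \{\phi_2\} \qquad \Gamma; l + 4 \vdash_b C : \Gamma'}{\Gamma; l \vdash_b i; C : \{l : \mathrm{codeptr}(\phi_1)\} \cap \Gamma'}$$

Proof. The models of $\Gamma; l \vdash_i \{\phi_1\}\ i\ \{\phi_2\}$¹ and $\Gamma; l + 4 \vdash_b C : \Gamma'$ give us that 

$$\{l : \mathrm{instr}(i)\} \cap \Gamma \Subset \{l : \mathrm{codeptr}(\phi_1)\} \qquad \mathrm{offset}_{l+4}(\Delta(C)) \cap \Gamma \Subset \Gamma'$$

The goal $\mathrm{offset}_l(\Delta(i; C)) \cap \Gamma \Subset \{l : \mathrm{codeptr}(\phi_1)\} \cap \Gamma'$ is proved by the following 
lemmas: 

$$\frac{\Gamma \Subset \Gamma_1 \qquad \Gamma \Subset \Gamma_2}{\Gamma \Subset \Gamma_1 \cap \Gamma_2} \qquad\qquad \mathrm{offset}_l(\Delta(i; C)) = \{l : \mathrm{instr}(i)\} \cap \mathrm{offset}_{l+4}(\Delta(C))$$

□ 

The proof of rule block_2 is similar. 

¹ The model of $\vdash_i$ has another clause $\{l + 4 : \mathrm{codeptr}(\phi_2)\}$ on the left of $\Subset$. However, since 
$\Gamma(l + 4) = \mathrm{codeptr}(\phi_2)$, we can prove that $\Gamma \cap \{l + 4 : \mathrm{codeptr}(\phi_2)\} = \Gamma$. 

5.4 Semantic Proofs of Machine Instructions 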

What remains is the proofs of typing rules for instructions (Fig. 4). We will show the 
technique by informally proving the load rule. 

$$\frac{\phi \subset \{s : \mathrm{box}(\tau)\}}{\Gamma; l \vdash_i \{\phi\}\ \mathtt{ld}\ s, d\ \{\phi[d \mapsto \tau]\}}\ \ \text{load}$$

The precondition states that register $s$ is a pointer to type $\tau$; the postcondition is that 
register $d$ gets type $\tau$ and the types of the other registers remain the same. This typing rule is 
intuitively “correct” since operationally $\mathtt{ld}\ s, d$ loads the content at address $r(s)$ in the 
memory into register $d$. The semantic model of $\vdash_i$ tells us what the “correctness” of the 
rule load means: 

$$\{l : \mathrm{instr}(\mathtt{ld}\ s, d)\} \cap \Gamma \cap \{l + 4 : \mathrm{codeptr}(\phi')\} \Subset \{l : \mathrm{codeptr}(\phi)\}$$

where $\phi' = \phi[d \mapsto \tau]$. That is, for all $k$ and $m$, we should prove $(m, l) :_{k+1} \mathrm{codeptr}(\phi)$, 
or that location $l$ is safe within $k + 1$ steps. We can assume (1) there is an instruction $\mathtt{ld}\ s, d$ 
at address $l$ in $m$; (2) all the labels in $\Gamma$ are code pointers to approximation $k$; (3) label 
$l + 4$ is of type $\mathrm{codeptr}(\phi')$ to approximation $k$. 

By the definition of $(m, l) :_{k+1} \mathrm{codeptr}(\phi)$, we start at $(r, m)$ with condition $\phi$ met 
and control at $l$. By the semantics of $\mathtt{ld}\ s, d$, if the location to read (location $r(s)$) is 
readable, we can find a succeeding state $(r', m')$ such that $(r, m) \mapsto (r', m')$. Not by 
coincidence, rule load has the premise $\phi \subset \{s : \mathrm{box}(\tau)\}$; together with the fact that $\phi$ is met 
on state $(r, m)$, $\mathrm{readable}(r(s))$ can be shown. 

Therefore, there is a state $(r', m')$ that $(r, m)$ can step to because of the execution 
of the instruction $\mathtt{ld}\ s, d$. By the semantics of $\mathtt{ld}$, condition $\phi'$ can be proved to hold on 
state $(r', m')$ and the control of $(r', m')$ is at label $l + 4$. Because label $l + 4$ is of type 
$\mathrm{codeptr}(\phi')$ to approximation $k$, state $(r', m')$ is safe within $k$ steps. Taking the step by 
$\mathtt{ld}$ into account, the first state $(r, m)$ is safe within $k + 1$ steps. 
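
The step-counting argument above can be made concrete on a toy machine. The following Python sketch is only an illustration under stated assumptions: the two-instruction machine, the separate `code` map (the paper decodes instructions from memory), the `readable` set, and the names `step` and `safe_n` are all hypothetical stand-ins, not the SPARC specification used in the actual proofs.

```python
# Toy machine (hypothetical): a state is (regs, mem); "pc" is one of the registers.
# Instructions are kept in a separate `code` map purely for brevity.

def step(code, readable, regs, mem):
    """Return the successor state, or None if the machine is stuck."""
    instr = code.get(regs["pc"])
    if instr is None:
        return None                       # no instruction at pc: stuck
    if instr[0] == "ld":                  # ("ld", s, d): d := mem[regs[s]]
        _, s, d = instr
        addr = regs[s]
        if addr not in readable:
            return None                   # load from an unreadable address: stuck
        new_regs = dict(regs)
        new_regs[d] = mem[addr]
        new_regs["pc"] = regs["pc"] + 4
        return new_regs, mem
    if instr[0] == "ba":                  # ("ba", target): unconditional branch
        new_regs = dict(regs)
        new_regs["pc"] = instr[1]
        return new_regs, mem
    return None                           # unknown instruction: stuck


def safe_n(k, code, readable, regs, mem):
    """True if the (deterministic) machine can take k steps without getting stuck."""
    for _ in range(k):
        nxt = step(code, readable, regs, mem)
        if nxt is None:
            return False
        regs, mem = nxt
    return True


if __name__ == "__main__":
    code = {0: ("ld", "r1", "r2"), 4: ("ba", 0)}
    mem, readable = {100: 7}, {100}
    print(safe_n(10, code, readable, {"pc": 0, "r1": 100, "r2": 0}, mem))
    print(safe_n(1, code, readable, {"pc": 0, "r1": 200, "r2": 0}, mem))
```

In this sketch, a state whose register `r1` holds a readable address plays the role of a state satisfying the precondition: one `ld` step leads to a state at `pc + 4`, and safety for `k` further steps there yields safety for `k + 1` steps at the start.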

Proofs for other typing rules of instructions follow the same scheme. In the case of 
control-transfer instructions like $\mathtt{ba}$, the assumption (2) about $\Gamma$ guarantees that it is safe 
to jump to the destination address. 

6 Implementation 

Our work is a part of the Foundational Proof-Carrying Code project [12], which includes 
a compiler from ML to SPARC, a typed assembly language called LTAL [6], and semantic 
proofs of typing rules. We have successfully given models to typing judgments in LTAL 
based on the proof techniques in this paper, with a proved type-safety theorem and nearly 
complete semantic proofs of the typing rules. All the proofs are written and machine-checked 
in the Twelf theorem-proving system [13]: 1949 lines of axioms of the logic, 
arithmetic, and the specification of the SPARC machine; 23562 lines of lemmas of logic 
and arithmetic, and theories of mathematical foundations such as sets and lists; 72036 lines 
of lemmas about conventions of machine states and the semantic model of types; 18895 lines 
of lemmas about machine instructions and the LTAL calculus. Most of the incomplete 
semantic proofs are about machine instruction semantics. 

To focus the presentation on the essential ideas, we have not shown many features 
of our actual implementation. In this paper we have used immutable reference types 
to describe data structures in the memory and proved that programs can safely access 
these data structures. Our implementation can also deal with mutable references [10] so 
that programs can safely update data structures in the memory. Allocation of new data 
structures in the memory has also been taken into account by following the allocation 
model of SML/NJ. However, we do not currently support either explicit deallocation 
(free) or garbage collection. LTAL is also more expressive, including type variables, 
quantified types and condition-code types. It can also type-check position-independent 
code and multi-instruction basic blocks. Our semantic model supports all these features. 



7 Related Work 

There has been some work in the program-verification community to use a semantic 
approach to prove Hoare-logic rules as lemmas in an underlying logic [7,14,15]. Such 
proofs have been mechanized in HOL [14]. These works use first-order or higher-order 
logic to specify the invariants and have the difficulty that loop invariants cannot be 
derived automatically, so the approach does not scale to large programs. 

Hamid et al. [16] and Crary [17] use a syntactic approach to prove type soundness. 
The first stage of their approach develops a typed assembly language, which is also 
given an operational semantics on an abstract machine. Then syntactic type-soundness 
theorems are proved on this abstract machine following the scheme presented by Wright 
and Felleisen [18]. The second stage uses a simulation relation between the abstract 
machine and the concrete architecture. The syntactic approach does not need the building 
of denotational semantics for complicated types such as recursive types, and it can 
also have machine-checkable proofs. However, the simulation step between the abstract 
machine and a full-fledged architecture is not a trivial task. 

In some sense, the problem we solve in this paper is to give models to unstructured 
programs with goto statements and labels. There has been work by de Bruin [19] to give 
goto statements a domain-theoretic model. His approach to prove that code respects 
invariants is by approximations over code behavior, i.e. any $k$-th approximation of 
code behavior respects invariants. In our approach, we use types as invariants and do 
approximations over types, i.e. code respects any $k$-th approximation of types. 

In conclusion, we have shown how to build end-to-end foundational safety proofs of 
programs on a real machine. We have constructed a semantic model for typing judgments 
of a typed assembly language and given proofs for both the type-safety theorem and 
typing rules. Our approach allows a typing derivation to be interpreted as a machine- 
checkable safety proof at the machine level. 
Abstract. We present a rule-based framework for defining and implementing 
finite trace monitoring logics, including future and past time temporal logic, 
extended regular expressions, real-time logics, interval logics, forms of quantified 
temporal logics, and so on. Our logic, Eagle, is implemented as a Java library 
and involves novel techniques for rule definition, manipulation and execution. 
Monitoring is done on a state-by-state basis, without storing the execution trace. 



1 Introduction 

Runtime verification, or runtime monitoring, comprises having a software module, an 
observer, monitor the execution of a program and check its conformity with a requirement 
specification, often written in a temporal logic or as a state machine. Runtime verification 
can be applied to automatically evaluate test runs, either on-line, or off-line analyzing 
stored execution traces; or it can be used on-line during operation, potentially steering the 
application back to a safety region if a property is violated. It is highly scalable. Several 
runtime verification systems have been developed, of which some were presented at 
three recent international workshops on runtime verification [1]. 

Linear temporal logic (LTL) [19] has been core to several of these attempts. The 
commercial tool Temporal Rover (TR) [6,7] supports a fixed future and past time LTL, 
with the possibility of specifying real-time and data constraints (time-series) as anno- 
tations on the temporal operators. Its implementation is based on alternating automata. 
Algorithms using alternating automata to monitor LTL properties are also proposed in 
[9], and a specialized LTL collecting statistics along the execution trace is described in 
[8]. The MAC logic [18] is a form of past time LTL with operators inspired by interval 
logics and which models real-time via explicit clock variables. A logic based on extended 
regular expressions [20] has also been proposed and is argued to be more succinct for 
certain properties. The logic described in [16] is a sophisticated interval logic, argued 
to be more user-friendly than plain LTL. Our own previous work includes the devel- 
opment of several algorithms, such as generating dynamic programming algorithms for 
past time logic [14], using a rewriting system for monitoring future time logic [12,13], 
or generating Büchi automata inspired algorithms adapted to finite trace LTL [11]. 

* This author is most grateful to RIACS/USRA and to the UK’s EPSRC under grant 
GR/S40435/01 for the partial support provided to conduct this research whilst at NASA Ames 
Research Center. 

** This author is grateful for the support received from RIACS to undertake this research while 
participating in the Summer Student Research Program at the NASA Ames Research Center. 
This large variety of logics prompted us to search for a small and general framework 
for defining monitoring logics, which would be powerful enough to capture essentially 
all of the above described logics, hence supporting future and past time logics, interval 
logics, extended regular expressions, state machines, real-time and data constraints, 
and statistics. The framework should support the definition of new logics in an easy 
manner and should support the monitoring of programs with their complex program 
states. The result of our search is the logic Eagle presented in this paper. The Eagle 
logic and its implementation for runtime monitoring has been significantly influenced 
by earlier work on the executable, trace generating as well as trace checking, temporal 
logic MetateM [3]. In MetateM a linear-time temporal formula is separated [10] into 
a boolean combination of pure past, present and pure future time formulas. Conditioned 
by the past, the present-time, or state, formulas determine how the state for the current 
moment in time is built and the pure future time formulas yield obligations that need to be 
fulfilled at some time later. The separation result, rules and future obligations are central 
in our current work. However, the fundamental difference between MetateM and Eagle 
is that the MetateM interpreter builds traces state by state, whereas Eagle is used for 
checking given finite traces: costly implementation features, such as backtracking and 
loop-checking, are not required. 

We recently discovered parallel work [17] using recursive equations to implement 
a real-time logic. However, we had already developed the ideas further. We provide the 
language of recursive equations to the user, we support a mixture of future time and past 
time operators, we treat real-time as a special case of data values, and hence we allow 
a very general logic for reasoning about data, including the possibility of relating data 
values across the execution trace, both forwards and backwards. 

This paper is structured as follows. Section 2 introduces our logic framework. In 
Section 3 we discuss the algorithm and calculus that underlie our implementation, 
which is then briefly described along with initial experimentation in Section 4. Further 
papers on Eagle are available covering material that could not be covered in this paper. In 
[4], we illustrate that when Eagle is specialized to a propositional LTL our monitoring 
algorithm is space efficient with an upper bound of $O(m^2 \log m\, 2^m)$, where $m$ is the 
size of the monitored formula; in [5], we present full details of our current Java-based 
monitoring algorithm and its associated rewrite calculus. 

2 The Logic 

In this section we introduce our temporal finite trace monitoring logic Eagle. The 
logic offers a succinct but powerful set of primitives, essentially supporting recursive 
parameterized equations, with a minimal/maximal fix-point semantics together with 
three temporal operators: next-time, previous-time, and concatenation. The next-time 
and previous-time operators can be used for defining future time and past time 
temporal logics, respectively, on top of Eagle. The concatenation operator can be used to define 
interval logics and an extended regular expression language. Rules can be parameterized 
with formulas, and with data to allow for the expression of data constraints, including 
real-time constraints. Atomic propositions are boolean expressions over a program state, 
Java states in the current implementation. The logic is first introduced informally through 
two examples, whereafter its syntax and semantics are given. Finally, its relationship to 
some other important logics is outlined. 
2.1 Eagle by Example 

Fundamental Concepts. Assume we want to state a property about a program P, which 
contains the declaration of two integer variables $x$ and $y$. We want to state that whenever 
$x$ is positive then eventually $y$ becomes positive. The property can be written as follows 
in classical future time LTL: $\Box(x > 0 \rightarrow \Diamond y > 0)$. The formulas $\Box F$ (always $F$) and 
$\Diamond F$ (eventually $F$), for some property $F$, usually satisfy the following equivalences, 
where the temporal operator $\bigcirc F$ stands for next $F$ (meaning ‘in next state $F$’): 

$$\Box F \equiv F \wedge \bigcirc(\Box F) \qquad \Diamond F \equiv F \vee \bigcirc(\Diamond F)$$

One can for example show that $\Box F$ is a solution to the recursive equivalence $X \equiv 
F \wedge \bigcirc X$; in fact it is the maximal solution¹. A fundamental idea in our logic is to 
support this kind of recursive definition, and to enable users to define their own temporal 
combinators using equations similar to those above. In the current framework one can 
write the following definitions for the two combinators Always and Eventually, and 
the formula to be monitored ($M_1$): 

$$\begin{array}{l}
\text{max}\ \ \text{Always}(\text{Form}\ F) = F \wedge \bigcirc \text{Always}(F) \\
\text{min}\ \ \text{Eventually}(\text{Form}\ F) = F \vee \bigcirc \text{Eventually}(F) \\
\text{mon}\ \ M_1 = \text{Always}(x > 0 \rightarrow \text{Eventually}(y > 0))
\end{array}$$

The Always operator is defined as having a maximal fix-point interpretation; the 
Eventually operator is defined as having a minimal interpretation. Maximal rules 
define safety properties (nothing bad ever happens), while minimal rules define liveness 
properties (something good eventually happens). For us, the difference only becomes 
important when evaluating formulas at the boundaries of a trace. To understand how this 
works it suffices to say here that monitored rules evolve as new states appear. 
Assume that the end of the trace has been reached (we are beyond the last state) and a 
monitored formula $F$ has evolved to $F'$. Then all applications in $F'$ of maximal fix-point 
rules will evaluate to true, since they represent safety properties that apparently have 
been satisfied throughout the trace, while applications of minimal fix-point rules will 
evaluate to false, indicating that some event did not happen. Assume for example that we 
evaluate the formula $M_1$ in a state where $x > 0$ and $y < 0$; then as a liveness obligation 
for the future we will have the expression: 

$$\text{Eventually}(y > 0) \wedge \text{Always}(x > 0 \rightarrow \text{Eventually}(y > 0))$$

Assume that we at this point detect the end of the trace; that is, we are beyond the 
last state. The outstanding liveness obligation $\text{Eventually}(y > 0)$ has not yet been 
fulfilled, which is an error. This is captured by the evaluation of the minimal fix-point 
combinator Eventually being false at this point. The other remaining obligation from 
the $\wedge$-formula, namely $\text{Always}(x > 0 \rightarrow \text{Eventually}(y > 0))$, is a safety property 
and evaluates to true. 
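
The evolution of a monitored formula and its evaluation at the end of the trace can be sketched in a few lines of Python. This is a simplified illustration, not the Eagle calculus (which is described in [5]): formulas are encoded as tuples, only the two combinators Always and Eventually are built in, and the names `progress`, `at_end`, and `monitor` are hypothetical.

```python
TRUE, FALSE = ("true",), ("false",)

# Smart constructors that simplify away boolean constants.
def And(a, b):
    if a == FALSE or b == FALSE:
        return FALSE
    if a == TRUE:
        return b
    return a if b == TRUE else ("and", a, b)

def Or(a, b):
    if a == TRUE or b == TRUE:
        return TRUE
    if a == FALSE:
        return b
    return a if b == FALSE else ("or", a, b)

def Not(a):
    return FALSE if a == TRUE else TRUE if a == FALSE else ("not", a)

def progress(f, state):
    """Rewrite f into the obligation that must hold from the next position on."""
    tag = f[0]
    if tag in ("true", "false"):
        return f
    if tag == "prop":                       # proposition: evaluated in the current state
        return TRUE if f[1](state) else FALSE
    if tag == "not":
        return Not(progress(f[1], state))
    if tag == "and":
        return And(progress(f[1], state), progress(f[2], state))
    if tag == "or":
        return Or(progress(f[1], state), progress(f[2], state))
    if tag == "implies":
        return Or(Not(progress(f[1], state)), progress(f[2], state))
    if tag == "always":                     # F /\ next Always(F)
        return And(progress(f[1], state), f)
    if tag == "eventually":                 # F \/ next Eventually(F)
        return Or(progress(f[1], state), f)
    raise ValueError(f"unknown formula: {tag}")

def at_end(f):
    """Value of f beyond the last state: maximal rules are true, minimal rules false."""
    tag = f[0]
    if tag == "true" or tag == "always":
        return True
    if tag in ("false", "prop", "eventually"):
        return False
    if tag == "not":
        return not at_end(f[1])
    if tag == "and":
        return at_end(f[1]) and at_end(f[2])
    if tag == "or":
        return at_end(f[1]) or at_end(f[2])
    if tag == "implies":
        return (not at_end(f[1])) or at_end(f[2])
    raise ValueError(f"unknown formula: {tag}")

def monitor(f, trace):
    for state in trace:                     # state-by-state; the trace is never stored
        f = progress(f, state)
    return at_end(f)

EV_Y = ("eventually", ("prop", lambda s: s["y"] > 0))
M1 = ("always", ("implies", ("prop", lambda s: s["x"] > 0), EV_Y))
```

With these definitions, `progress(M1, {"x": 1, "y": 0})` yields the conjunction of `EV_Y` and `M1`, mirroring the obligation shown above; if the trace ends there, `at_end` reports a violation because the pending Eventually is a minimal rule.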

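For completeness we provide the remaining definitions of the future time LTL operators 
$\mathcal{U}$ (until) and $\mathcal{W}$ (unless) below. Note how $\mathcal{W}$ is defined in terms of other operators. 
However, it could have been defined recursively. 

$$\begin{array}{l}
\text{min}\ \ \text{Until}(\text{Form}\ F_1, \text{Form}\ F_2) = F_2 \vee (F_1 \wedge \bigcirc \text{Until}(F_1, F_2)) \\
\text{max}\ \ \text{Unless}(\text{Form}\ F_1, \text{Form}\ F_2) = \text{Until}(F_1, F_2) \vee \text{Always}(F_1)
\end{array}$$

¹ Similarly, $\Diamond F$ is a minimal solution to the equivalence $X \equiv F \vee \bigcirc X$. 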
Data Parameters. We have seen how rules can be parameterized with formulas. Let 
us modify the above example to include data parameters. Suppose we want to state the 
property: “whenever at some point $x = k > 0$ for some $k$, then eventually $y = k$”. This 
can be expressed as follows in quantified LTL: $\Box(x > 0 \rightarrow \exists k.\,(x = k \wedge \Diamond y = k))$. We 
use a parameterized rule to state this property, capturing the value of $x$ when $x > 0$ as 
a rule parameter. 

$$\text{min}\ \ R(\text{int}\ k) = \text{Eventually}(y = k) \qquad \text{mon}\ \ M_2 = \text{Always}(x > 0 \rightarrow R(x))$$

Rule $R$ is parameterized with an integer $k$, and is instantiated in $M_2$ when $x > 0$, 
hence capturing the value of $x$ at that moment. Rule $R$ replaces the existential quantifier. 
The logic also provides a previous-time operator, which allows us to define past time 
operators; the data parametrization works uniformly for rules over past as well as future, 
which is non-trivial to achieve since the implementation does not store the trace (see 
Section 4). Data parametrization is also used to elegantly model real-time logics. 
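
As a rough illustration of how a data parameter freezes a value, the property $M_2$ can be checked state by state with a hand-written monitor. The sketch below is specialised to this one property and is not how Eagle itself evaluates rules; the function name `monitor_m2` and the dictionary representation of states are assumptions made for the example.

```python
def monitor_m2(trace):
    """Check Always(x > 0 -> R(x)) with R(k) = Eventually(y = k) on a finite trace.

    `pending` holds every captured value k whose obligation Eventually(y = k)
    has not been discharged yet; the trace itself is never stored.
    """
    pending = set()
    for state in trace:
        if state["x"] > 0:
            pending.add(state["x"])      # instantiate R with the current value of x
        pending.discard(state["y"])      # Eventually(y = k) is fulfilled in this state
    # Beyond the last state, any remaining minimal-rule obligation is false.
    return not pending


if __name__ == "__main__":
    print(monitor_m2([{"x": 3, "y": 0}, {"x": 0, "y": 3}]))
    print(monitor_m2([{"x": 3, "y": 0}, {"x": 0, "y": 2}]))
```

The captured value is added before the discharge step because Eventually includes the current state, so a state with `x == y > 0` satisfies its own obligation immediately.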



2.2 Syntax and Semantics 

Syntax. A specification $S$ consists of a declaration part $D$ and an observer part $O$. $D$ 
consists of zero or more rule definitions $R$, and $O$ consists of zero or more monitor 
definitions $M$, which specify what is to be monitored. Rules and monitors are named ($N$). 

$$\begin{array}{rcl}
S & ::= & D\ O \\
D & ::= & R^* \\
O & ::= & M^* \\
R & ::= & \{\text{max} \mid \text{min}\}\ N(T_1\ x_1, \ldots, T_n\ x_n) = F \\
M & ::= & \text{mon}\ N = F \\
T & ::= & \text{Form} \mid \textit{primitive type} \\
F & ::= & \textit{expression} \mid \text{true} \mid \text{false} \mid \neg F \mid F_1 \wedge F_2 \mid F_1 \vee F_2 \mid F_1 \rightarrow F_2 \mid \\
  &     & \bigcirc F \mid \odot F \mid F_1 \cdot F_2 \mid N(F_1, \ldots, F_n) \mid x_i
\end{array}$$

A rule definition $R$ is preceded by a keyword indicating whether the interpretation is 
maximal or minimal (which we recall determines the value of a rule application at the 
boundaries of the trace). Parameters are typed, and can either be a formula of type 
Form, or of a primitive type, such as int, long, float, etc. The body of a rule/monitor 
is a boolean valued formula of the syntactic category Form (with meta-variables $F$, 
etc.). Any recursive call on a rule must be strictly guarded by a temporal operator. 
The propositions of this logic are boolean expressions over an observer state. Formulas 
are composed using standard propositional logic operators together with a next-state 
operator ($\bigcirc F$), a previous-state operator ($\odot F$), and a concatenation operator ($F_1 \cdot F_2$). 
Finally, rules can be applied and their arguments must be type correct. That is, an 
argument of type Form can be any formula, with the restriction that if the argument is 
an expression, it must be of boolean type. An argument of a primitive type must be an 
expression of that type. Arguments can be referred to within the rule body ($x_i$). 

In what follows, a rule $N$ of the form 

$$\{\text{max} \mid \text{min}\}\ N(\text{Form}\ f_1, \ldots, \text{Form}\ f_m, T_1\ p_1, \ldots, T_n\ p_n) = B,$$

where $f_1, \ldots, f_m$ are arguments of type Form and $p_1, \ldots, p_n$ are arguments of primitive 
type, is written in short as 

$$\{\text{max} \mid \text{min}\}\ N(\text{Form}\ \bar{f}, \bar{T}\ \bar{p}) = B$$

where $\bar{f}$ and $\bar{p}$ represent tuples of type Form and $\bar{T}$ respectively. Without loss of 
generality, in the above rule we assume that all the arguments of type Form appear first. 



Semantics. The semantics of the logic is defined in terms of a satisfaction relation $\models$ 
between execution traces and specifications. An execution trace $\sigma$ is a finite sequence 
of program states $\sigma = s_1 s_2 \ldots s_n$, where $|\sigma| = n$ is the length of the trace. The $i$'th 
state $s_i$ of a trace $\sigma$ is denoted by $\sigma(i)$. The term $\sigma^{[i,j]}$ denotes the sub-trace of $\sigma$ from 
position $i$ to position $j$, both positions included; if $i > j$ then $\sigma^{[i,j]}$ denotes the empty 
trace. In the implementation a state is a user defined Java object that is updated through 
a user provided updateOnEvent method for each new event generated by the program. 
Given a trace $\sigma$ and a specification $D\ O$, satisfaction is defined as follows: 

$$\sigma \models D\ O \quad \text{iff} \quad \forall (\text{mon}\ N = F) \in O.\ \ \sigma, 1 \models_D F$$

That is, a trace satisfies a specification if the trace, observed from position 1 (the first 
state), satisfies each monitored formula. The definition of the satisfaction relation $\models_D\ 
\subseteq (\mathit{Trace} \times \mathit{nat}) \times \text{Form}$, for a set of rule definitions $D$, is presented below, where 
$0 \le i \le n + 1$ for some trace $\sigma = s_1 s_2 \ldots s_n$. Note that the position of a trace can 
become 0 (before the first state) when going backwards, and can become $n + 1$ (after 
the last state) when going forwards, both cases causing rule applications to evaluate to 
either true if maximal or false if minimal, without considering the body of the rules at 
that point. 

$$\begin{array}{lcl}
\sigma, i \models_D \textit{expression} & \text{iff} & 1 \le i \le |\sigma| \text{ and } \mathit{evaluate}(\textit{expression})(\sigma(i)) == \text{true} \\
\sigma, i \models_D \text{true} & & \\
\sigma, i \not\models_D \text{false} & & \\
\sigma, i \models_D \neg F & \text{iff} & \sigma, i \not\models_D F \\
\sigma, i \models_D F_1 \wedge F_2 & \text{iff} & \sigma, i \models_D F_1 \text{ and } \sigma, i \models_D F_2 \\
\sigma, i \models_D F_1 \vee F_2 & \text{iff} & \sigma, i \models_D F_1 \text{ or } \sigma, i \models_D F_2 \\
\sigma, i \models_D F_1 \rightarrow F_2 & \text{iff} & \sigma, i \models_D F_1 \text{ implies } \sigma, i \models_D F_2 \\
\sigma, i \models_D \bigcirc F & \text{iff} & i \le |\sigma| \text{ and } \sigma, i + 1 \models_D F \\
\sigma, i \models_D \odot F & \text{iff} & 1 \le i \text{ and } \sigma, i - 1 \models_D F \\
\sigma, i \models_D F_1 \cdot F_2 & \text{iff} & \exists j \text{ s.t. } i \le j \le |\sigma| + 1 \text{ and } \sigma^{[1,j-1]}, i \models_D F_1 \text{ and } \sigma^{[j,|\sigma|]}, 1 \models_D F_2 \\
\sigma, i \models_D N(\bar{F}, \bar{P}) & \text{iff} & \left\{\begin{array}{l}
\text{if } 1 \le i \le |\sigma| \text{ then:} \\
\quad \sigma, i \models_D B[\bar{f} \mapsto \bar{F}, \bar{p} \mapsto \mathit{evaluate}(\bar{P})(\sigma(i))] \\
\quad \text{where } (N(\text{Form}\ \bar{f}, \bar{T}\ \bar{p}) = B) \in D \\
\text{otherwise, if } i = 0 \text{ or } i = |\sigma| + 1 \text{ then:} \\
\quad \text{rule } N \text{ is defined as max in } D
\end{array}\right.
\end{array}$$

An expression (a proposition) is evaluated in the current state in case the position $i$ is 
within the trace ($1 \le i \le n$). In the boundary cases ($i = 0$ and $i = n + 1$) a proposition 
evaluates to false. Propositional operators have their standard semantics in all positions. 
A next-time formula $\bigcirc F$ evaluates to true if the current position is not beyond the 
last state and $F$ holds in the next position. Dually for the previous-time formula. The 
concatenation formula $F_1 \cdot F_2$ is true if the trace $\sigma$ can be split into two sub-traces 
$\sigma = \sigma_1 \sigma_2$, such that $F_1$ is true on $\sigma_1$, observed from the current position $i$, and $F_2$ is true 
on $\sigma_2$ (ignoring $\sigma_1$, and thereby limiting the scope of past time operators). Applying a 
rule within the trace (positions $1 \ldots n$) consists of replacing the call with the right-hand 
side of the definition, substituting arguments for formal parameters; if an argument is of 
primitive type its evaluation in the current state is substituted for the associated formal 
parameter of the rule, thereby capturing a desired freeze variable semantics. At the 
boundaries (0 and $n + 1$) a rule application evaluates to true if and only if it is maximal. 
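
The satisfaction relation can be transcribed almost clause by clause into a recursive checker over a complete trace. The Python sketch below is only an illustration of the definition, not the state-by-state algorithm of the implementation: the tuple encoding of formulas, the representation of a rule body as a Python function, and the names `sat` and `RULES` are assumptions of this example, and the concatenation operator is omitted for brevity. Recursion terminates for rules that move in one direction only (as all rules in this section do).

```python
def sat(rules, trace, i, f):
    """Decide sigma, i |=_D f for 0 <= i <= len(trace) + 1 (positions are 1-based)."""
    n = len(trace)
    tag = f[0]
    if tag == "expr":      # propositions are false at the boundaries 0 and n + 1
        return 1 <= i <= n and bool(f[1](trace[i - 1]))
    if tag == "true":
        return True
    if tag == "false":
        return False
    if tag == "not":
        return not sat(rules, trace, i, f[1])
    if tag == "and":
        return sat(rules, trace, i, f[1]) and sat(rules, trace, i, f[2])
    if tag == "or":
        return sat(rules, trace, i, f[1]) or sat(rules, trace, i, f[2])
    if tag == "implies":
        return (not sat(rules, trace, i, f[1])) or sat(rules, trace, i, f[2])
    if tag == "next":
        return i <= n and sat(rules, trace, i + 1, f[1])
    if tag == "prev":
        return 1 <= i and sat(rules, trace, i - 1, f[1])
    if tag == "app":       # ("app", name, formula_args, data_args)
        _, name, forms, data = f
        kind, body = rules[name]
        if 1 <= i <= n:
            # Data arguments are evaluated in the current state (freeze semantics).
            values = [d(trace[i - 1]) for d in data]
            return sat(rules, trace, i, body(*forms, *values))
        return kind == "max"   # boundary: true iff the rule is maximal
    raise ValueError(f"unknown formula: {tag}")


# Rule bodies are functions from arguments to formulas (an encoding choice).
RULES = {
    "Always": ("max", lambda F: ("and", F, ("next", ("app", "Always", (F,), ())))),
    "Eventually": ("min", lambda F: ("or", F, ("next", ("app", "Eventually", (F,), ())))),
    "R": ("min", lambda k: ("app", "Eventually", (("expr", lambda s: s["y"] == k),), ())),
}
M2 = ("app", "Always",
      (("implies", ("expr", lambda s: s["x"] > 0),
        ("app", "R", (), (lambda s: s["x"],))),), ())
```

Here `sat(RULES, trace, 1, M2)` corresponds to the top-level check of a monitor from position 1.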



2.3 Relationship to Other Logics 

The logical system defined above is expressively rich; indeed, any linear-time tempo- 
ral logic, whose temporal modalities can be recursively defined over the next, past or 
concatenation modalities, can be embedded within it. Furthermore, since in effect we 
have a limited form of quantification over possibly infinite data sets, and concatenation, 
we are strictly more expressive than, say, a linear temporal fixed point logic (over next 
and previous). A formal characterization of the logic is beyond the scope of this paper; 
however, we demonstrate the logic’s utility and expressiveness through examples. 



Past Time LTL: A past time linear temporal logic, i.e. one whose temporal modalities 
only look to the past, could be defined in the mirror way to the future time logic 
exemplified in the introduction by using the built-in previous modality, $\odot$, in place of the future 
next time modality, $\bigcirc$. Here, however, we present the definitions in a more hierarchic 
(and logical) fashion. Note that the Zince rule defines the past time correspondent to 
the future time unless, or weak until, modality, i.e. it is a weak version of Since. 

$$\begin{array}{l}
\text{min}\ \ \text{Since}(\text{Form}\ F_1, \text{Form}\ F_2) = F_2 \vee (F_1 \wedge \odot \text{Since}(F_1, F_2)) \\
\text{min}\ \ \text{EventuallyInPast}(\text{Form}\ F) = \text{Since}(\text{true}, F) \\
\text{max}\ \ \text{AlwaysInPast}(\text{Form}\ F) = \neg \text{EventuallyInPast}(\neg F) \\
\text{max}\ \ \text{Zince}(\text{Form}\ F_1, \text{Form}\ F_2) = \text{Since}(F_1, F_2) \vee \text{AlwaysInPast}(F_1)
\end{array}$$
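
For state predicates, these past time rules can be evaluated state by state with constant memory, because each value depends only on the current state and the value at the previous position. The following Python sketch illustrates that recurrence under simplifying assumptions: the arguments are plain predicates over a state rather than arbitrary formulas, and the generator name `past_values` is hypothetical.

```python
def past_values(trace, f1, f2):
    """Yield (Since(F1, F2), Zince(F1, F2)) at each position of the trace.

    f1 and f2 are predicates over a single state. Only two booleans are kept
    between states, so the trace does not need to be stored.
    """
    since_prev = False       # Since at position 0: a minimal rule is false there
    f1_violated = False      # EventuallyInPast(not F1) at position 0 is false
    for state in trace:
        since_now = f2(state) or (f1(state) and since_prev)
        f1_violated = f1_violated or not f1(state)
        always_in_past_f1 = not f1_violated
        yield since_now, since_now or always_in_past_f1
        since_prev = since_now


if __name__ == "__main__":
    trace = [{"p": True, "q": False}, {"p": False, "q": True}, {"p": True, "q": False}]
    for values in past_values(trace, lambda s: s["p"], lambda s: s["q"]):
        print(values)
```

The first state of the example shows the difference between the two rules: Since is false because `q` has not held yet, while Zince is true because `p` has held at every position so far.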



Combined Future and Past Time LTL: By combining the definitions for the future 
and past time LTLs defined above, we obtain a temporal logic over the future, present 
and past, in which one can freely intermix the future and past time modalities (to any 
depth)². We are thus able to express constraints such as: if ever the variable $x$ exceeds 0, 
there was an earlier moment when the variable $y$ was 4 and then remains with that value 
until it is increased sometime later, possibly after the moment when $x$ exceeds 0. 

$$\text{mon}\ \ M_2 = \text{Always}(x > 0 \rightarrow \text{EventuallyInPast}(y = 4 \wedge \text{Until}(y = 4, y > 4)))$$



Extended LTL and $\mu$TL: The ability to define temporal modalities recursively provides 
the ability to define Wolper’s ETL or the semantically equivalent fixpoint temporal 
calculus. Such expressiveness is required to capture regular properties such as: temporal 
formula $F$ is required to be true on every even moment of time: 

$$\text{max}\ \ \text{Even}(\text{Form}\ F) = F \wedge \bigcirc \bigcirc \text{Even}(F)$$

The $\mu$TL formula $\nu x.(p \wedge \bigcirc \bigcirc x \wedge \mu y.((q \wedge \bigcirc x) \vee \bigcirc y))$, where $p$ and $q$ are atomic 
formulas, would be denoted by the formula $X()$, where rules $X$ and $Y$ are: 

$$\text{max}\ \ X() = p \wedge \bigcirc \bigcirc X() \wedge Y() \qquad \text{min}\ \ Y() = (q \wedge \bigcirc X()) \vee \bigcirc Y()$$

² See [4] for the correctness argument for such an embedding of propositional LTL in Eagle. 



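Extended Regular Expressions: The language of Extended Regular Expressions
(ERE), i.e. adding complementation to regular expressions, has been proposed as a
powerful formalism for runtime monitoring. EREs can straightforwardly be embedded
within our rule-based system. Given $E ::= \emptyset \mid \epsilon \mid a \mid E \cdot E \mid E + E \mid E \cap E \mid \lnot E \mid E^*$, let
Tr(E) denote the ERE E's corresponding Eagle formula. For convenience, we define
the rule $\mathbf{max}\ \mathrm{Empty}() = \lnot\bigcirc\mathrm{true}$, which is true only when evaluated on an empty (suffix)
sequence. Tr is inductively defined as follows.

$\mathrm{Tr}(\emptyset) = \mathrm{false}$ \qquad $\mathrm{Tr}(\epsilon) = \mathrm{Empty}()$

$\mathrm{Tr}(a) = a \land \bigcirc\mathrm{Empty}()$ \qquad $\mathrm{Tr}(E_1 \cdot E_2) = \mathrm{Tr}(E_1) \cdot \mathrm{Tr}(E_2)$

$\mathrm{Tr}(E_1 + E_2) = \mathrm{Tr}(E_1) \lor \mathrm{Tr}(E_2)$ \qquad $\mathrm{Tr}(E_1 \cap E_2) = \mathrm{Tr}(E_1) \land \mathrm{Tr}(E_2)$

$\mathrm{Tr}(\lnot E) = \lnot\mathrm{Tr}(E)$

$\mathrm{Tr}(E^*) = X()$ where $\mathbf{max}\ X() = \mathrm{Empty}() \lor (\mathrm{Tr}(E) \cdot X())$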
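
The inductive definition of Tr can be sketched as a small recursive function. This is only an illustration under assumptions of our own: EREs are encoded as nested Python tuples, the output is a plain string in an ad hoc ASCII syntax (`&`, `|`, `!`, `o` for next, `.` for concatenation), and the generated rule names `X0`, `X1`, ... are hypothetical. It is not the syntax accepted by the Eagle tool.

```python
def tr(ere, rules):
    """Translate an ERE (nested tuples) into an Eagle-like formula string.

    `rules` is a list that collects the auxiliary max rules generated for
    Kleene stars; the tuple tags used here are an assumed encoding.
    """
    kind = ere[0]
    if kind == "empty_set":
        return "false"
    if kind == "epsilon":
        return "Empty()"
    if kind == "atom":
        # one state satisfying the atom, followed by the empty suffix
        return f"({ere[1]} & o Empty())"
    if kind == "concat":
        return f"({tr(ere[1], rules)} . {tr(ere[2], rules)})"
    if kind == "union":
        return f"({tr(ere[1], rules)} | {tr(ere[2], rules)})"
    if kind == "inter":
        return f"({tr(ere[1], rules)} & {tr(ere[2], rules)})"
    if kind == "not":
        return f"!{tr(ere[1], rules)}"
    if kind == "star":
        body = tr(ere[1], rules)          # inner stars register their rules first
        name = f"X{len(rules)}"           # fresh name, unique within `rules`
        rules.append(f"max {name}() = Empty() | ({body} . {name}())")
        return f"{name}()"
    raise ValueError(f"unknown ERE constructor: {kind}")


rules = []
formula = tr(("star", ("atom", "a")), rules)
print(formula)   # the monitored formula
print(rules)     # the auxiliary rule generated for the star
```

Each star introduces one recursive max rule, mirroring the last equation of the translation.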



Real Time as a Special Case of Data Binding: Metric temporal logics, in which
temporal modalities are parameterized by some underlying real-time clock(s), can be
straightforwardly embedded into our system through rule parameterization. For example,
consider the metric temporal modality $\Diamond^{[t_1,t_2]}$ in a system with just one global clock. An
absolute interpretation of $\Diamond^{[t_1,t_2]} F$ has the formula true if and only if F holds at some
time in the future when the real-time clock has a value within the interval $[t_1, t_2]$. For
our purposes, we assume that the states being monitored are time-stamped and that the
variable clock holds the value of the real-time clock for the associated state. The rule

$\mathbf{min}\ \mathrm{EventAbs}(\mathbf{Form}\ F, \mathbf{float}\ t_1, \mathbf{float}\ t_2) =$
$\quad (F \land t_1 \leq \mathrm{clock} \land \mathrm{clock} \leq t_2)\ \lor$
$\quad ((\mathrm{clock} < t_1 \lor (\lnot F \land \mathrm{clock} \leq t_2)) \land \bigcirc\,\mathrm{EventAbs}(F, t_1, t_2))$

defines the operator $\Diamond^{[t_1,t_2]}$ for absolute values of the clock. The rule will succeed when
the formula F evaluates to true and clock is within the specified interval $[t_1, t_2]$. If
either the formula F doesn't hold and the time-stamp is within the upper time bound,
or the lower bound hasn't been reached, then the rule is applied to the next input state.
Note that the rule will fail as soon as either the time-stamp is beyond the given interval,
i.e. $\mathrm{clock} > t_2$, and the formula F has not been satisfied, or the end of the input trace
has been passed. A relativized version of the modality can then be defined as:

$\mathbf{min}\ \mathrm{EventRel}(\mathbf{Form}\ F, \mathbf{float}\ t_1, \mathbf{float}\ t_2) = \mathrm{EventAbs}(F, \mathrm{clock} + t_1, \mathrm{clock} + t_2)$
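
To make the case analysis in EventAbs concrete, the rule can be unrolled by hand over a finite, time-stamped trace. The sketch below is an assumption-laden illustration rather than the monitor Eagle synthesizes: states are Python dictionaries with a `clock` entry, F is passed as a predicate `holds`, and the function names are ours.

```python
def event_abs(trace, holds, t1, t2):
    """Hand-unrolled reading of EventAbs on a finite time-stamped trace."""
    for state in trace:
        clock = state["clock"]
        if holds(state) and t1 <= clock <= t2:
            return True                      # first disjunct: F holds inside [t1, t2]
        if clock < t1 or (not holds(state) and clock <= t2):
            continue                         # second disjunct: retry on the next state
        return False                         # upper bound passed without success
    return False                             # a min rule is false at the end of the trace


def event_rel(trace, holds, t1, t2):
    """Relativized version: the bounds are offsets from the first state's clock."""
    if not trace:
        return False
    start = trace[0]["clock"]
    return event_abs(trace, holds, start + t1, start + t2)


trace = [{"clock": 1, "ok": False}, {"clock": 3, "ok": True}, {"clock": 7, "ok": True}]
print(event_abs(trace, lambda s: s["ok"], 2, 5))
print(event_abs(trace, lambda s: s["ok"], 4, 6))
```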
Counting and Statistical Calculations: In a monitoring context, one may wish to
gather statistics on the truth of some property, for example whether a particular state
property F holds with at least some probability p over a given sequence, i.e. it doesn't
fail with probability greater than (1 - p). Consider the operator $\Box_p F$ defined by:

$\sigma, i \models \Box_p F$ iff $\exists S \subseteq \{i..|\sigma|\}$ s.t. $\frac{|S|}{|\sigma| - i} \geq p \ \land\ \forall j \in S\,.\ \sigma, j \models F$

An encoding within our logic can then be given as:

$\mathbf{min}\ A(\mathbf{Form}\ F, \mathbf{float}\ p, \mathbf{int}\ f, \mathbf{int}\ t) =$
$\quad (\bigcirc\mathrm{Empty}() \land ((F \land (1 - \frac{f}{t}) \geq p) \lor (\lnot F \land (1 - \frac{f+1}{t}) \geq p)))\ \lor$
$\quad (\lnot\mathrm{Empty}() \land ((F \rightarrow \bigcirc A(F, p, f, t + 1)) \land (\lnot F \rightarrow \bigcirc A(F, p, f + 1, t + 1))))$

$\mathbf{min}\ \mathrm{AtLeast}(\mathbf{Form}\ F, \mathbf{float}\ p) = A(F, p, 0, 1)$

The auxiliary rule A counts the number of failures of F in its argument f and the number
of events monitored in argument t. Thus, at the end of monitoring, the first line of A's
body determines whether F has held with the desired probability, i.e. $\geq p$. AtLeast
therefore calls A with arguments f and t initialized to 0 and 1 respectively.
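
The bookkeeping done by the auxiliary rule A can be mimicked with two counters. The following sketch assumes the truth value of F in each state has already been computed and is given as a list of booleans; the function name `at_least` is ours, and the loop is a hand-written analogue of the rule, not generated monitor code.

```python
def at_least(truth_values, p):
    """Check that F held in at least a fraction p of the states of a finite trace."""
    f, t = 0, 1                               # failures so far, events seen (as in AtLeast)
    last = len(truth_values) - 1
    for i, holds in enumerate(truth_values):
        if i == last:
            # first line of A: decide on the last state, counting it if it fails
            failures = f if holds else f + 1
            return 1 - failures / t >= p
        if not holds:
            f += 1                            # second line of A: carry counters forward
        t += 1
    return False                              # empty trace: a min rule is false at the end


print(at_least([True, True, False, True], 0.75))
print(at_least([True, True, False, True], 0.8))
```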



Towards Context Free: Above we showed that Eagle could encode logics such as 
ETL, which extend LTL with regular grammars (when restricted to finite traces), or 
even extended regular expressions. In fact, we can go beyond regularity into the world 
of context-free languages, necessary, for example, to express properties such as every 
login is matched by a logout and at no point are there more logouts than logins. Indeed, 
such a property can be expressed in several ways in Eagle. Assume we are monitoring
a sequence of login and logout events, characterized, respectively, by the formulas
login and logout. We can define a rule Match(Form $F_1$, Form $F_2$) and monitor with
Match(login, logout) where:

$\mathbf{min}\ \mathrm{Match}(\mathbf{Form}\ F_1, \mathbf{Form}\ F_2) = F_1 \cdot \mathrm{Match}(F_1, F_2) \cdot F_2 \cdot \mathrm{Match}(F_1, F_2) \lor \mathrm{Empty}()$

Less elegantly, and which we leave as an exercise, one could use the rule parametrization 
mechanism to count the numbers of logins and logouts. 
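
The counting alternative mentioned above can be pictured outside Eagle with a single integer parameter. This is a plain Python sketch of the property itself, under the assumption that events are the strings "login" and "logout"; it is not an Eagle rule.

```python
def logins_matched(events):
    """True iff every login is matched by a logout and logouts never outnumber logins."""
    open_sessions = 0
    for event in events:
        if event == "login":
            open_sessions += 1
        elif event == "logout":
            open_sessions -= 1
            if open_sessions < 0:        # more logouts than logins at this point
                return False
    return open_sessions == 0            # no login left unmatched at the end


print(logins_matched(["login", "login", "logout", "logout"]))
print(logins_matched(["login", "logout", "logout", "login"]))
```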



3 Algorithm 

In this section, we briefly outline the computation mechanism used to determine whether 
a given monitoring formula holds for some given input sequence of events. For details 
on the algorithm, the interested readers can refer to [5]. In the algorithm, we assume 
that a local state is maintained on the observer side. The expressions or propositions 
are specified with respect to the variables in this local state. At every event the observer
updates its local state based on that event, then evaluates the
monitored formulas on the new state, and generates a new set of monitored formulas.
At the end of the trace the values of the monitored formulas are determined. If the value 
of a formula is true, the formula is satisfied, otherwise the formula is violated. 
First, a monitor formula F is transformed to another formula F'. This transformation
addresses the semantics of Eagle at the beginning of a trace. Next, the transformed
formula is monitored against an execution trace by repeated application of eval. The
evaluation of a formula F on a state $s = \sigma(i)$ in a trace $\sigma$ results in another formula
$\mathit{eval}\langle\langle F, s\rangle\rangle$ with the property that $\sigma, i \models F$ if and only if $\sigma, i + 1 \models \mathit{eval}\langle\langle F, s\rangle\rangle$. The
definition of the function $\mathit{eval} : \mathrm{Form} \times \mathrm{State} \rightarrow \mathrm{Form}$ uses an auxiliary function update with
signature $\mathit{update} : \mathrm{Form} \times \mathrm{State} \rightarrow \mathrm{Form}$. The role of update is to pre-evaluate a formula if it is
guarded by the previous operator $\odot$. Formally, update has the property that $\sigma, i \models \bigcirc F$
iff $\sigma, i + 1 \models \mathit{update}\langle\langle F, s\rangle\rangle$. Had there been no past time modality in Eagle, update would
be unnecessary and the identity $\sigma, i \models \bigcirc F$ iff $\sigma, i + 1 \models F$ could have been used. At
the end (or at the beginning) of a trace, the function $\mathit{value} : \mathrm{Form} \rightarrow \{\mathrm{true}, \mathrm{false}\}$, when
applied on F, returns true iff $\sigma, |\sigma| + 1 \models F$ (or $\sigma, 0 \models F$) and returns false otherwise.
Thus given a sequence of states $s_1 s_2 \ldots s_n$, an Eagle formula F is said to be satisfied by
the sequence of states if and only if $\mathit{value}\langle\langle \mathit{eval}\langle\langle \ldots \mathit{eval}\langle\langle \mathit{eval}\langle\langle F', s_1\rangle\rangle, s_2\rangle\rangle \ldots, s_n\rangle\rangle \rangle\rangle$ is
true. The functions eval, update and value are the basis of the calculus for our rule-based
framework.



3.1 Calculus 

The eval, update and value functions are defined a priori for all operators except for the 
rule application. The definitions of eval, update and value for rules get generated based 
on the definition of rules in the specification. The definitions of eval, update and value 
on the different primitive operators are given below. 

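$\mathit{eval}\langle\langle \mathrm{true}, s\rangle\rangle = \mathrm{true}$ \qquad $\mathit{value}\langle\langle \mathrm{true}\rangle\rangle = \mathrm{true}$

$\mathit{eval}\langle\langle \mathrm{false}, s\rangle\rangle = \mathrm{false}$ \qquad $\mathit{value}\langle\langle \mathrm{false}\rangle\rangle = \mathrm{false}$

$\mathit{eval}\langle\langle \mathit{jexp}, s\rangle\rangle = $ value of $\mathit{jexp}$ in $s$ \qquad $\mathit{value}\langle\langle \mathit{jexp}\rangle\rangle = \mathrm{false}$

$\mathit{eval}\langle\langle F_1\ \mathit{op}\ F_2, s\rangle\rangle = \mathit{eval}\langle\langle F_1, s\rangle\rangle\ \mathit{op}\ \mathit{eval}\langle\langle F_2, s\rangle\rangle$ \qquad $\mathit{value}\langle\langle F_1\ \mathit{op}\ F_2\rangle\rangle = \mathit{value}\langle\langle F_1\rangle\rangle\ \mathit{op}\ \mathit{value}\langle\langle F_2\rangle\rangle$

$\mathit{eval}\langle\langle \lnot F, s\rangle\rangle = \lnot\mathit{eval}\langle\langle F, s\rangle\rangle$ \qquad $\mathit{value}\langle\langle \lnot F\rangle\rangle = \lnot\mathit{value}\langle\langle F\rangle\rangle$

$\mathit{eval}\langle\langle \bigcirc F, s\rangle\rangle = \mathit{update}\langle\langle F, s\rangle\rangle$ \qquad $\mathit{value}\langle\langle \bigcirc F\rangle\rangle = \begin{cases} F & \text{if at the beginning of trace} \\ \mathrm{false} & \text{if at the end of trace} \end{cases}$

$\mathit{eval}\langle\langle F_1 \cdot F_2, s\rangle\rangle = \begin{cases} \mathit{eval}\langle\langle F_1, s\rangle\rangle \cdot F_2 & \text{if } \mathit{value}\langle\langle F_1\rangle\rangle = \mathrm{false} \\ (\mathit{eval}\langle\langle F_1, s\rangle\rangle \cdot F_2) \lor \mathit{eval}\langle\langle F_2, s\rangle\rangle & \text{otherwise} \end{cases}$ \qquad $\mathit{value}\langle\langle F_1 \cdot F_2\rangle\rangle = \mathit{value}\langle\langle F_1\rangle\rangle \land \mathit{value}\langle\langle F_2\rangle\rangle$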



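$\mathit{update}\langle\langle \mathrm{true}, s\rangle\rangle = \mathrm{true}$
$\mathit{update}\langle\langle \mathrm{false}, s\rangle\rangle = \mathrm{false}$
$\mathit{update}\langle\langle \mathit{jexp}, s\rangle\rangle = \mathit{jexp}$

$\mathit{update}\langle\langle F_1\ \mathit{op}\ F_2, s\rangle\rangle = \mathit{update}\langle\langle F_1, s\rangle\rangle\ \mathit{op}\ \mathit{update}\langle\langle F_2, s\rangle\rangle$
$\mathit{update}\langle\langle \lnot F, s\rangle\rangle = \lnot\mathit{update}\langle\langle F, s\rangle\rangle$
$\mathit{update}\langle\langle F_1 \cdot F_2, s\rangle\rangle = \mathit{update}\langle\langle F_1, s\rangle\rangle \cdot F_2$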

In the above definitions, op can be $\land$, $\lor$, $\rightarrow$. In most of the definitions we simply propagate
the function to the subformulas. However, the concatenation operator is handled in a
special way. The eval of a formula $F_1 \cdot F_2$ on a state s first checks whether $\mathit{value}\langle\langle F_1\rangle\rangle$ is true.
If it is, then one can non-deterministically split the trace just before
the state s. Hence, the evaluation becomes $(\mathit{eval}\langle\langle F_1, s\rangle\rangle \cdot F_2) \lor \mathit{eval}\langle\langle F_2, s\rangle\rangle$, where $\lor$
expresses the non-determinism. Otherwise, if the trace cannot be split, the evaluation
becomes simply $\mathit{eval}\langle\langle F_1, s\rangle\rangle \cdot F_2$. The function update on the formula $F_1 \cdot F_2$ simply
updates the formula $F_1$, as $F_2$ is not affected by the trace that affects $F_1$. At the end of a
trace, that $F_1 \cdot F_2$ is satisfied means that the remaining empty trace can be split into two
empty traces satisfying respectively $F_1$ and $F_2$; hence the conjunction in $\mathit{value}\langle\langle F_1 \cdot F_2\rangle\rangle$.
Since the semantics of $\bigcirc$ is different at the beginning and at the end of a trace, we have
to consider the two cases in the definition of value for $\bigcirc F$.
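
The treatment of concatenation can be tried out on a small future-time fragment. The sketch below is a simplification under stated assumptions: formulas are nested Python tuples of our own design, rules and the previous operator are left out (so update is the identity and eval of a next formula just returns its argument), value is only taken at the end of a trace, and no simplification of the resulting formulas is attempted.

```python
EMPTY = ("not", ("next", True))                  # Empty() = not (next true)


def sym(name):
    """Tr(a): exactly one state, equal to `name` (states are strings here)."""
    return ("and", ("atom", lambda s: s == name), ("next", EMPTY))


def value(f):
    """Value of a formula at the end of the trace."""
    if isinstance(f, bool):
        return f
    tag = f[0]
    if tag in ("atom", "next"):
        return False                             # nothing left to observe
    if tag == "not":
        return not value(f[1])
    if tag == "or":
        return value(f[1]) or value(f[2])
    if tag in ("and", "cat"):
        return value(f[1]) and value(f[2])       # both halves accept the empty trace
    raise ValueError(tag)


def eval_(f, s):
    """One monitoring step: rewrite f into the obligation for the next position."""
    if isinstance(f, bool):
        return f
    tag = f[0]
    if tag == "atom":
        return bool(f[1](s))
    if tag == "not":
        return ("not", eval_(f[1], s))
    if tag in ("and", "or"):
        return (tag, eval_(f[1], s), eval_(f[2], s))
    if tag == "next":
        return f[1]                              # update is the identity in this fragment
    if tag == "cat":
        step = ("cat", eval_(f[1], s), f[2])
        # if F1 may end here, the trace can also be split just before s
        return ("or", step, eval_(f[2], s)) if value(f[1]) else step
    raise ValueError(tag)


def monitor(f, trace):
    for s in trace:
        f = eval_(f, s)
    return value(f)


ab = ("cat", sym("a"), sym("b"))                 # the ERE a . b
print(monitor(ab, ["a", "b"]), monitor(ab, ["a", "b", "b"]))
```

The disjunction introduced for concatenation is what lets the monitor keep both the "split here" and the "keep reading the first part" alternatives alive.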

The operator $\odot$ requires special attention. If a formula F is guarded by a previous
operator then we evaluate F at every event and use the result of this evaluation in the
next state. Thus, the result of evaluating F is required to be stored in some temporary
placeholder so that it can be used in the next state. To allocate a placeholder for a $\odot$
operator, we introduce the operator $\mathrm{Previous} : \mathrm{Form} \times \mathrm{Form} \rightarrow \mathrm{Form}$. The second
argument for this operator acts as the placeholder. We transform a formula $\odot F$ at the
beginning of monitoring as follows:

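$\odot F \rightarrow \mathrm{Previous}(F', \mathit{value}\langle\langle F'\rangle\rangle)$ where $F'$ is the transformed version of $F$

We define eval, update, and value for Previous as follows:

$\mathit{eval}\langle\langle \mathrm{Previous}(F, \mathit{past}), s\rangle\rangle = \mathit{eval}\langle\langle \mathit{past}, s\rangle\rangle$

$\mathit{update}\langle\langle \mathrm{Previous}(F, \mathit{past}), s\rangle\rangle = \mathrm{Previous}(\mathit{update}\langle\langle F, s\rangle\rangle, \mathit{eval}\langle\langle F, s\rangle\rangle)$

$\mathit{value}\langle\langle \mathrm{Previous}(F, \mathit{past})\rangle\rangle = \begin{cases} \mathrm{false} & \text{if at the beginning of trace} \\ \mathit{value}\langle\langle \mathit{past}\rangle\rangle & \text{if at the end of trace} \end{cases}$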



Here, eval of $\mathrm{Previous}(F, \mathit{past})$ returns the eval of the second argument of Previous,
which contains the evaluation of F in the previous state. In update we not only update the
first argument F but also evaluate F and pass it as the second argument of Previous.
Thus in the next state the second argument of Previous, past, is bound to $\odot F$. The
$\mathit{value}\langle\langle F'\rangle\rangle$ that appears in the transformation of $\odot F$ is the value of $F'$ at the beginning
of the trace. This takes care of the semantics of Eagle at the beginning of a trace.
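
The placeholder mechanism can be illustrated for the simplest case, where F is a state predicate. The class below is a hypothetical Python analogue, not the Java class generated by the tool: `eval` returns the stored placeholder, `update` builds the node for the next state, and the initial placeholder is false, the value of a state expression at the beginning of a trace.

```python
class Previous:
    """Previous(F, past) for a state predicate F (illustrative names)."""

    def __init__(self, pred, past=False):
        self.pred = pred          # first argument: the guarded formula F
        self.past = past          # second argument: F as evaluated in the previous state

    def eval(self, state):
        return self.past          # truth of "previously F" in the current state

    def update(self, state):
        # evaluate F now and store the result as the placeholder for the next state
        return Previous(self.pred, self.pred(state))


def previously(pred, trace):
    """Truth value of 'previous pred' at each state of the trace."""
    node, out = Previous(pred), []
    for state in trace:
        out.append(node.eval(state))
        node = node.update(state)
    return out


print(previously(lambda x: x > 0, [1, 0, 2]))
```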



3.2 Monitor Synthesis for Rules 

In what follows, $\rho b.H(b)$ denotes a recursive structure where free occurrences of $b$ in $H$
point back to $\rho b.H(b)$. Formally, $\rho b.H(b)$ is a closed form term that denotes a fix-point
solution to the equation $x = H(x)$ and hence $\rho b.H(b) = H(\rho b.H(b))$. The open form
$H(b)$ denotes a formula with free recursion variable $b$. In structural terms, a solution to
$x = H(x)$ can be represented as a graph structure where the leaves, denoted by $x$, point
back to the root node of the graph. Our implementation uses this structural solution.

We replace every rule R by an operator R during transformation. For a subformula
R(F, P), where the rule R is defined as $\{\mathbf{max} \mid \mathbf{min}\}\ R(\mathbf{Form}\ f, T\ p) = B$, we transform
the subformula as follows:

$R(F, P) \rightarrow R(\rho b.B'[f \mapsto F'], P)$
where $B'$ and $F'$ are transformed versions of $B$ and $F$ respectively

$R(F, P') \rightarrow R(b, P')$ where $R(F, P')$ is a subformula of $B$

The second equation is invoked if a recursion is detected,^ that is, while transforming
B, if R(F, P') is encountered as a subformula of B. Note that here the variable b should
be a fresh name to avoid possible variable capturing. For example consider the formula
$\Box(x > 0 \rightarrow \exists k(k = x \land \blacklozenge(z > 0 \land y = k)))$. A specification for this monitor can be
presented in Eagle as follows:

^ A formal description of recursion detection mechanism can be found in [5].

$\mathbf{max}\ A(\mathbf{Form}\ f) = f \land \bigcirc A(f)$
$\mathbf{min}\ \mathrm{Ep}(\mathbf{Form}\ f) = f \lor \odot\,\mathrm{Ep}(f)$
$\mathbf{min}\ \mathrm{Ev}(\mathbf{int}\ k) = \mathrm{Ep}(z > 0 \land y = k)$
$\mathbf{mon}\ M = A(x > 0 \rightarrow \mathrm{Ev}(x))$

The transformed version of M is as follows:

$M' = A(\rho b_1.((x > 0) \rightarrow$
$\quad \mathrm{Ev}(\rho b_2.\mathrm{Ep}(\rho b_3.((z > 0) \land (y = k) \lor \mathrm{Previous}(\mathrm{Ep}(b_3), \mathrm{false}))), x)) \land \mathrm{Next}(A(b_1)))$

The definitions of update, eval and value for R are as follows:

$\mathit{update}\langle\langle R(\rho b.H(b), P), s\rangle\rangle = R(\rho b'.\mathit{update}\langle\langle H(\rho b.H(b)), s\rangle\rangle, P)$
$\mathit{update}\langle\langle R(\rho b.H(b), P), s\rangle\rangle = R(b', P)$ \quad if $R(\rho b.H(b), P)$ is a subformula of $H(\rho b.H(b))$

Here, $\rho b.H(b)$ is first expanded to $H(\rho b.H(b))$ and then update is applied on it; that is,
the body of the rule is updated. The second equation detects a recursion, that is, update
of $H(\rho b.H(b))$ encounters $R(\rho b.H(b), P)$ as a subformula of $H(\rho b.H(b))$. In that case
$R(b', P)$ is returned, terminating the recursion.

$\mathit{eval}\langle\langle R(\rho b.H(b), P), s\rangle\rangle = \mathit{eval}\langle\langle H(\rho b.H(b))[p \mapsto \mathit{eval}\langle\langle P, s\rangle\rangle], s\rangle\rangle$

Here, $\rho b.H(b)$ is first expanded to $H(\rho b.H(b))$ and then any arguments of primitive
type are evaluated and substituted in the expansion. The function eval is then applied
on the expansion. Note that the result of $\mathit{eval}\langle\langle P, s\rangle\rangle$, where P is an expression, may
be a partially evaluated expression if expressions referred to by some of the variables
in P are partially evaluated. The expression gets fully evaluated once all the variables
referred to by the expressions are fully evaluated.

$\mathit{value}\langle\langle R(B, P)\rangle\rangle = \mathrm{false}$ if R is minimal \qquad $\mathit{value}\langle\langle R(B, P)\rangle\rangle = \mathrm{true}$ if R is maximal

The value of a max rule is true and that of a min rule is false.

For example, for the sequence of states {x = 0, y = 3, z = 1}, {x = 0, y = 5, z = 2},
{x = 2, y = 2, z = 0}, step-by-step monitoring of the formula M' on this
sequence takes place as follows:

Step 1: s = {x = 0, y = 3, z = 1}

$F_1 = \mathit{eval}\langle\langle M', s\rangle\rangle$
$\quad = A(\rho b_1.((x > 0) \rightarrow$
$\quad\quad \mathrm{Ev}(\rho b_2.\mathrm{Ep}(\rho b_3.((z > 0) \land (y = k) \lor \mathrm{Previous}(\mathrm{Ep}(b_3), (3 = k)))), x)) \land \mathrm{Next}(A(b_1)))$

Observe that in the above step the second argument of Previous is partially evaluated as
the value of k is not available. The value of k becomes available when we apply eval on
Ev. At that time eval also replaces the free variable k appearing in the first argument of
Ev by the actual value. This replacement can easily be seen if the definition of eval of
Ev is written down.

Step 2: s = {x = 0, y = 5, z = 2}

$F_2 = \mathit{eval}\langle\langle F_1, s\rangle\rangle$
$\quad = A(\rho b_1.((x > 0) \rightarrow \mathrm{Ev}(\rho b_2.\mathrm{Ep}(\rho b_3.((z > 0) \land (y = k) \lor$
$\quad\quad \mathrm{Previous}(\mathrm{Ep}(b_3), (3 = k) \lor (5 = k)))), x)) \land \mathrm{Next}(A(b_1)))$

Step 3: s = {x = 2, y = 2, z = 0}

$F_3 = \mathit{eval}\langle\langle F_2, s\rangle\rangle = \mathrm{false}$

Thus the formula is violated on the third state of the trace. 
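
The three steps can be replayed with a hand-specialized check of this one property. The sketch is not the generic calculus: the set `candidates` is our stand-in for the partially evaluated disjunction (3 = k) ∨ (5 = k) stored in the Previous placeholder, and the function name is hypothetical.

```python
def first_violation(trace):
    """Index of the first state violating M, or None if the trace satisfies it.

    M: whenever x > 0, some state up to and including the current one
    had z > 0 and y equal to the current value of x.
    """
    candidates = set()                       # values v for which (v = k) is in the placeholder
    for i, s in enumerate(trace):
        if s["z"] > 0:
            candidates.add(s["y"])           # Ep records (y = k) for this state
        if s["x"] > 0 and s["x"] not in candidates:
            return i                         # k is now bound to x and no disjunct holds
    return None


trace = [
    {"x": 0, "y": 3, "z": 1},
    {"x": 0, "y": 5, "z": 2},
    {"x": 2, "y": 2, "z": 0},
]
print(first_violation(trace))                # index of the third state
```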



4 Implementation and Experiments 

We have implemented this monitoring framework in Java. The implemented system 
works in two phases. First, it compiles the specification file to generate a set of Java 
classes; a class is generated for each rule. Second, the Java class files are compiled into 
Java bytecode and then the monitoring engine runs on a trace; the engine dynamically
loads the Java classes for rules at monitoring time. 

Our implementation of propositional logic uses the decision procedure of Hsiang 
[15]. The procedure reduces a tautological formula to the constant true, a false formula 
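to the constant false, and all other formulas to canonical forms which are exclusive or
($\oplus$) of conjunctions. The procedure is given below using equations that are shown to be
Church-Rosser and terminating modulo associativity and commutativity.

$\mathrm{true} \land \phi = \phi$ \qquad $\mathrm{false} \land \phi = \mathrm{false}$ \qquad $\phi_1 \land (\phi_2 \oplus \phi_3) = (\phi_1 \land \phi_2) \oplus (\phi_1 \land \phi_3)$

$\phi \land \phi = \phi$ \qquad $\mathrm{false} \oplus \phi = \phi$ \qquad $\phi_1 \lor \phi_2 = (\phi_1 \land \phi_2) \oplus \phi_1 \oplus \phi_2$

$\phi \oplus \phi = \mathrm{false}$ \qquad $\lnot\phi = \mathrm{true} \oplus \phi$ \qquad $\phi_1 \rightarrow \phi_2 = \mathrm{true} \oplus \phi_1 \oplus (\phi_1 \land \phi_2)$

$\phi_1 \equiv \phi_2 = \mathrm{true} \oplus \phi_1 \oplus \phi_2$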
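
One way to see the effect of these equations is to compute the normal form directly instead of rewriting. In the sketch below, which is our own encoding and not the implementation described in the paper, a formula is a frozenset of conjunctions, each conjunction a frozenset of variable names; set symmetric difference plays the role of exclusive or, so a tautology collapses to `TRUE` and an unsatisfiable formula to `FALSE`.

```python
from itertools import product

TRUE = frozenset([frozenset()])       # the empty conjunction
FALSE = frozenset()                   # the empty exclusive-or


def var(name):
    return frozenset([frozenset([name])])


def xor(a, b):
    return a ^ b                      # phi xor phi = false, false xor phi = phi


def conj(a, b):
    out = frozenset()
    for m, n in product(a, b):        # distribute conjunction over exclusive-or
        out = out ^ frozenset([m | n])  # phi and phi = phi via set union
    return out


def neg(a):
    return xor(TRUE, a)


def disj(a, b):
    return xor(xor(conj(a, b), a), b)


def implies(a, b):
    return xor(xor(TRUE, a), conj(a, b))


def equiv(a, b):
    return xor(xor(TRUE, a), b)


p, q = var("p"), var("q")
print(disj(p, neg(p)) == TRUE)        # excluded middle reduces to true
print(conj(p, neg(p)) == FALSE)       # contradiction reduces to false
```

Because the representation is canonical, two formulas are equivalent exactly when their normal forms are equal as sets.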

The above equations ensure that the size of a formula is small. In the translational phase,
a Java class is generated for each rule in the specification. The Java class contains a
constructor, a value method, an eval method, and an update method corresponding to
the value, eval and update operators in the calculus. The arguments are made fields in
the class and they are initialized through the constructor. The choice of generating Java
classes for the rules was made in order to achieve an efficient implementation. To handle
partial evaluation we wrap every Java expression in a Java class. Each of those classes
contains a method isAvailable() that returns true whenever the Java expression
wrapped by that class is fully evaluated and returns false otherwise. The class also
stores, as fields, the Java expression objects corresponding to the
variables (formula variables and state variables) used in its Java expression. Once
all those Java expressions are fully evaluated, the object for the Java expression evaluates
itself and any subsequent call of isAvailable() on this object returns true.
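
The wrapping idea can be mimicked in a few lines. The following is a Python analogue with hypothetical class names, given only to illustrate the isAvailable protocol; the actual tool generates Java classes whose details are not shown in the paper.

```python
class Expr:
    """An expression that evaluates itself once all its dependencies are available."""

    def __init__(self, fn, *deps):
        self.fn, self.deps = fn, deps
        self._value, self._done = None, False

    def is_available(self):
        if not self._done and all(d.is_available() for d in self.deps):
            self._value = self.fn(*(d.get() for d in self.deps))
            self._done = True             # later calls return true without re-evaluating
        return self._done

    def get(self):
        if not self.is_available():
            raise ValueError("expression is only partially evaluated")
        return self._value


class Var(Expr):
    """A formula or state variable whose value is supplied later."""

    def __init__(self):
        super().__init__(None)

    def bind(self, value):
        self._value, self._done = value, True

    def is_available(self):
        return self._done


k = Var()
three_equals_k = Expr(lambda kv: 3 == kv, k)   # the partially evaluated (3 = k)
print(three_equals_k.is_available())           # k not yet bound
k.bind(3)
print(three_equals_k.is_available(), three_equals_k.get())
```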

Once all the Java classes have been generated, the engine compiles all the generated 
Java classes, creates a list of monitors (which are also formulas) and starts monitoring 
all of them. During monitoring the engine takes the states from the trace, one by one, and 
evaluates the list of monitors on each to generate another list of formulas that become 
the new monitors for the next state. If at any point a monitor (a formula) becomes false
an error message is generated and that monitor is removed from the list. At the end of
a trace the value of each monitor is calculated and if false, a warning message for the
particular monitor is generated. The details of the implementation are beyond the scope
of the paper. However, interested readers can get the tool from the authors.

Eagle has been applied to test a planetary rover controller in a collaborative effort
with other colleagues; see [2] for an earlier similar experiment using a simpler logic.
The rover controller, written in 35,000 lines of C++, executes action plans. The testing
environment consists of a test-case generator, automatically generating input plans for
the controller. Additionally, for each input plan a set of temporal formulas is generated
that the plan execution should satisfy. The controller is executed on the generated plans
and the implementation of Eagle is used to monitor that execution traces satisfy the
formulas. The automated testing system found a missing feature that had been overlooked
by the developers: the lower bounds on action execution duration were not checked by the
implementation, causing some executions to succeed when they should in fact have failed
because some action terminated too early. The temporal formulas, however,
correctly predicted failure in these cases. This error showed up later during actual rover
operation before it was corrected.



5 Conclusion and Future Work 

We have presented the succinct and powerful logic Eagle, based on recursive parameterized
rule definitions over three primitive temporal operators. We have indicated its
power by expressing some other sophisticated logics in it. Initial experiments have been
successful. Future work includes: optimizing the current implementation; supporting 
user-defined surface syntax; associating actions with formulas; and incorporating auto- 
mated program instrumentation. 
Abstract. Abstraction and abstract interpretation are key tools for
automatically verifying properties of systems. One of the major challenges
in abstract interpretation is how to obtain abstractions that are precise 
enough to provide useful information. 

In this talk, I will survey a parametric abstract domain called canoni- 
cal abstraction which was motivated by the shape analysis problem of 
determining “shape invariants” for programs that perform destructive 
updating on dynamically allocated storage. The shape analysis prob- 
lem was originally defined by [1]. Canonical abstraction was originally 
defined in [2]. A system for abstract interpretation based on canonical
abstraction was defined in [3,4].

I will discuss properties that have been verified using this abstraction. A 
couple of interesting properties of this abstract domain will be presented. 
Finally, I will also show some of the limitations of this domain. 
Abstract. The parametric shape analysis framework of Sagiv, Reps,
and Wilhelm [45,46] uses three-valued structures as dataflow lattice
elements to represent sets of states at different program points. The recent
work of Yorsh, Reps, Sagiv, and Wilhelm [48, 50] introduces a family of
formulas in (classical, two-valued) logic that are isomorphic to three-valued
structures [46] and represent the same sets of concrete states.

In this paper we introduce a larger syntactic class of formulas that has 
the same expressive power as the formulas in [48]. The formulas in [48] 
can be viewed as a normal form of the formulas in our syntactic class; 
we give an algorithm for transforming our formulas to this normal form. 
Our formulas make it obvious that the constraints are closed under all 
boolean operations and therefore form a boolean algebra. Our algorithm 
also gives a reduction of the entailment and the equivalence problems 
for these constraints to the satisfiability problem. 

Keywords: Shape Analysis, Program Verification, Abstract Interpreta- 
tion, Boolean Algebra, First-Order Logic, Model Checking 



1 Introduction 

Background. Shape analysis [46,32,22,20,12,16,15,9,37,27] is a technique for 
statically analyzing programs that manipulate dynamically allocated data struc- 
tures, and is important for precise reasoning about programs written in modern 
imperative programming languages. Parametric shape analysis [45,46] is a frame- 
work that can be instantiated to provide a variety of precise shape analyses. We 
can describe this approach informally as follows. The concrete program state is a
two-valued structure, that is, a finite relational structure $(U^\natural, \iota^\natural)$, which maps, for
example, a binary relation symbol $r$ to a binary relation $\iota^\natural(r) : U^\natural \times U^\natural \to \{0, 1\}$.

To represent a potentially infinite set of concrete program states, [46] uses
finite three-valued structures, which are relational structures in three-valued
logic [24,38]. A three-valued structure $(U, \iota)$ maps a binary relation symbol $r$
to a three-valued relation $\iota(r) : U \times U \to \{\{0\}, \{1\}, \{0, 1\}\}$. Three-valued
structures generalize the graphs used in several previous shape analyses [44,22,9].

* This research was supported in part by DARPA Contract F33615-00-C-1692, NSF
Grant CCR00-86154, NSF Grant CCR00-63513, and the Singapore-MIT Alliance.
The elements of the domain $U$ of a three-valued structure $(U, \iota)$ represent
disjoint non-empty sets of objects. Given two such sets $A$ and $B$, we can
compute the three-valued relation by $\iota(r)(A, B) = \{\iota^\natural(r)(a, b) \mid a \in A \land b \in B\}$.
As observed in [48,50], the fact $\iota(r)(A, B) = \{0\}$ means that the formula
$\lnot\exists x \exists y.\, A(x) \land B(y) \land r(x, y)$ holds on the two-valued structure $(U^\natural, \iota^\natural)$.
Similarly, the fact $\iota(r)(A, B) = \{1\}$ means that $\lnot\exists x \exists y.\, A(x) \land B(y) \land \lnot r(x, y)$ holds,
whereas $\iota(r)(A, B) = \{0, 1\}$ means that both $\exists x \exists y.\, A(x) \land B(y) \land r(x, y)$ and
$\exists x \exists y.\, A(x) \land B(y) \land \lnot r(x, y)$ hold. As a result, any three-valued structure can
be described by a corresponding formula in first-order logic [50]. In this paper
we take a closer look at the class of formulas that arise when characterizing
the meaning of three-valued structures. We characterize such formulas
as the set of all boolean combinations of certain simple formulas, such as
$\exists x \exists y.\, A(x) \land B(y) \land r(x, y)$ (see Definition 5). As a result, we establish that
the meaning of three-valued structures (under the tight concretization semantics
[50, Chapter 7]) is closed under all boolean operations and therefore forms
a boolean algebra.
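
The computation of the three-valued relation from a concrete one is a one-liner once the concrete relation is given as a set of pairs. The sketch below assumes that encoding; the function name is ours.

```python
def abstract_relation(rel, A, B):
    """iota(r)(A, B): the set of truth values r takes on pairs from A x B."""
    return frozenset(1 if (a, b) in rel else 0 for a in A for b in B)


rel = {(1, 2), (1, 3)}
print(abstract_relation(rel, {1}, {2, 3}))   # r holds for every pair
print(abstract_relation(rel, {1}, {2, 4}))   # r holds for some pairs only
print(abstract_relation(rel, {2}, {3}))      # r holds for no pair
```

The three possible results for non-empty A and B correspond to the three formulas discussed above.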

Characterizing structures using formulas. The characterization of three-valued
structures using formulas in first-order logic is presented for the first time
in [48,50]. Section 3.1 of [48] explains that the semantics of general three-valued
structures can represent the existence of graph coloring. As a result, three-valued
structures in general are not definable using first-order logic, but require the
use of monadic second-order logic [48, Section 4]. However, an interesting class
of three-valued structures can be represented using first-order logic [48, Section
3.2]; in particular, this is the case for bounded structures. Two versions of the
semantics for three-valued structures are of interest: the standard concretization
[45, Definition 3.5], [48, Chapter 3] and the tight concretization [48, Chapter 7]
(the latter corresponding to the canonical abstraction [45, Definition 3.6]). One
can view the characteristic formulas for canonical abstraction of [48, Chapter
7] as the starting point for the class of formulas in this paper: we show how to
allow a richer syntactic class of formulas, and give an algorithm for converting
these formulas to the characteristic formulas for canonical abstraction.

We have previously studied regular graph constraints [29,28], inspired by 
the semantics of role analysis [25,27,26]. Regular graph constraints abstract 
the notion of graph summaries where nodes do not have a unique abstraction 
criterion. In [28, 29] we observe that such constraints can be equivalently
characterized using graph summaries and using existential monadic second-order
logic formulas. Somewhat surprisingly, whereas the satisfiability of regular graph 
constraints is decidable [29, Section 2.4], the entailment and the equivalence of 
regular graph constraints are undecidable [29, Section 3], [28]. These properties 
of regular graph constraints are in contrast to the nice closure properties of the 
boolean shape analysis constraints of the present paper. 

1.1 Contributions 

Main result. The main result of this paper is a new syntactic class of formulas
that characterize the meaning of three-valued structures under tight concretization.
The new syntactic class is defined as the set of all boolean combinations
of formulas of a certain form. The proof of the equivalence of the new syntactic
class and previously introduced characteristic formulas for canonical abstraction
[48] is a normalization algorithm that transforms formulas in our syntactic
class to the characteristic formulas for canonical abstraction (which are isomorphic
to three-valued structures). Our characterization immediately implies that
the constraints expressible as the meaning of three-valued structures are closed
under all boolean operations; we thus call them "boolean shape analysis
constraints".

Consequences of boolean closure. The resulting closure properties of
boolean shape analysis constraints have several potential uses. The closure under
disjunction is necessary for fixpoint computations in dataflow analysis and
can easily be computed even for three-valued structures (by taking the union
of sets of three-valued structures). What our results show is that boolean shape
analysis constraints are also closed under conjunction and negation.

The conjunction of constraints is needed, for example, in compositional inter- 
procedural shape analysis, which computes the relation composition of relations 
on states. Conjunction allows the analysis to simultaneously retain the call-site 
specific information that the callee preserves across the call, and the postcondi- 
tion which summarizes the actions of the callee. 

The negation of constraints is useful for expressing deterministic branches in
control-flow graphs. For example, an if statement with the condition $c$ results in
conjoining the dataflow fact $d$ to yield $d \land c$ in the then branch, and $d \land \lnot c$ in the
else branch. Similarly, the assert(c) statement, which is an important mechanism
for program specification, has (in the relational semantics) the condition
$\lnot c$ for the branch which leads to an error state.

Finally, the closure under negation implies that both the implication and 
the equivalence of shape analysis constraints are reducible to the satisfiability 
of shape analysis constraints. The implication problem is important in compo- 
sitional shape analysis which uses assume/guarantee reasoning to show that a 
procedure conforms to its specification. 

Decidability of constraints. The closure of boolean shape analysis constraints
under boolean operations holds in the presence of arbitrary instrumentation
predicates [46, Section 5]. What the particular choice of instrumentation
predicates determines is whether the satisfiability problem for the constraints is
decidable. If the satisfiability problem for three-valued structures with a particular
choice of instrumentation predicates is decidable, our normalization algorithm
yields an algorithm for the satisfiability problem of formulas in the richer
syntactic class, which, by closure under boolean operations, gives an algorithm
for deciding the entailment and the equivalence of boolean shape analysis
constraints.

Consequences for program annotations. The ability to write program
annotations can greatly improve the effectiveness of static analysis, but the
representation of program properties in the program analysis is often different from
the representation of program properties that is appropriate for program
annotations. On the one hand, to synthesize invariants using fixpoint computation,
program analysis often uses a finite lattice of program properties. On the other
hand, program annotations should be expressed in some convenient, well-known
notation, such as a variation of first-order logic. A program analysis that utilizes
program specifications must bridge the gap between the analysis representation
and the program annotations, for example, by providing a translation from a
logic-based annotation language to the analysis representation. The translation
from the full first-order logic to three-valued structures is equivalent to
first-order theorem proving, and is therefore undecidable. Because we restrict our
attention to formulas of a particular form, we are able to find a (complete and
sound) decision procedure for generating three-valued structures that have the
same meaning as these formulas.^ The existence of this information-preserving
translation algorithm indicates that our formulas have the same expressive power
as three-valued structures. Nevertheless, our formulas are more flexible than the
direct use of three-valued structures (or formulas isomorphic to three-valued
structures). For example, our formulas may use sets that are potentially
intersecting or empty, while the summary nodes of three-valued structures represent
disjoint, non-empty sets of nodes.

In addition to the benefits for writing program annotations, the richer
syntactic class of formulas is potentially useful for analysis representations. A set
of three-valued structures corresponds to a disjunctive normal form; alternative
representations for three-valued structures may be more appropriate in some
cases.

2 Preliminaries 

We mostly follow the setup of [46]. Let $\mathcal{A}$ be a finite set of unary relation symbols
(with a typical element $A \in \mathcal{A}$) and $\mathcal{F}$ a finite set of binary relation symbols
(with a typical element $f \in \mathcal{F}$). For simplicity, we consider only unary and
binary relation symbols, which are usually sufficient for modelling dynamically
allocated structures. A two-valued structure is a pair $S^\natural = (U^\natural, \iota^\natural)$ where
$U^\natural$ is a finite non-empty set (of "concrete individuals"), $\iota^\natural(A) \in U^\natural \to \{0, 1\}$ for
$A \in \mathcal{A}$, and $\iota^\natural(f) \in (U^\natural)^2 \to \{0, 1\}$ for $f \in \mathcal{F}$. Let 2-STRUCT be the set of all
two-valued structures. A three-valued structure is a pair $S = (U, \iota)$ where $U$ is a
finite non-empty set (of "abstract individuals"), $\iota(A) \in U \to \{\{0\}, \{1\}, \{0, 1\}\}$
for $A \in \mathcal{A}$ and $\iota(f) \in U^2 \to \{\{0\}, \{1\}, \{0, 1\}\}$ for $f \in \mathcal{F}$. Let 3-STRUCT
denote the set of all three-valued structures. If $S^\natural$ is a two-valued structure and
$F$ a closed formula in first-order logic, then $[\![F]\!]^{S^\natural} \in \{0, 1\}$ denotes the
truth-value of $F$ in $S^\natural$, and $\gamma_F(F) = \{S^\natural \in \text{2-STRUCT} \mid [\![F]\!]^{S^\natural} = 1\}$ is the set of
models of $F$. If $C$ is a set of formulas, then $\mathrm{models}[C] = \{\gamma_F(F) \mid F \in C\}$ is
the set of sets of models of formulas from $C$. Let $\mathcal{A}_1 \subseteq \mathcal{A}$ be a finite subset of
unary predicates. We call elements of $\mathcal{A}_1$ abstraction predicates. An $\mathcal{A}_1$-bounded
structure is a three-valued structure $(U, \iota)$ for which the following two conditions
hold: 1) $\iota(A)(u) \in \{\{0\}, \{1\}\}$ for all $A \in \mathcal{A}_1$ and all $u \in U$; 2) if $u_1, u_2 \in U$ and
$u_1 \neq u_2$ then $\iota(A)(u_1) \neq \iota(A)(u_2)$ for some $A \in \mathcal{A}_1$. The following definition of
tight concretization corresponds to [48, Chapter 7], [45, Definition 3.6].

^ An alternative approach proposes the use of theorem provers to synthesize
three-valued structures from arbitrary first-order formulas [49,49,40,41], [48, Chapter 6].

Definition 1 (Tight Concretization). Let
$S^\natural = \langle U^\natural, \iota^\natural \rangle$ be a two-valued
structure, let $S = \langle U, \iota \rangle$ be a three-valued structure, and let
$h : U^\natural \to U$ be a surjective total function. We write
$S^\natural \sqsubseteq_T^h S$ iff

1. for every $A \in \mathcal{A}$ and $u \in U$:
$\iota(A)(u) = \{ \iota^\natural(A)(u^\natural) \mid h(u^\natural) = u \}$;

2. for every $f \in \mathcal{F}$ and $u_1, u_2 \in U$:
$\iota(f)(u_1,u_2) = \{ \iota^\natural(f)(u_1^\natural,u_2^\natural) \mid h(u_1^\natural) = u_1 \wedge h(u_2^\natural) = u_2 \}$.

We write $S^\natural \sqsubseteq_T S$ iff there exists a surjective total function
$h$ such that $S^\natural \sqsubseteq_T^h S$, and in that case we call $h$ a
homomorphism. The tight concretization of a three-valued structure $S$, denoted
$\gamma_T(S)$, is given by:
$\gamma_T(S) = \{ S^\natural \mid S^\natural \sqsubseteq_T S \}$. We extend
$\gamma_T$ to $\gamma_T^*$, which acts on sets of three-valued structures so that
the set denotes a disjunction:
$\gamma_T^*(\mathcal{S}) = \bigcup_{S \in \mathcal{S}} \gamma_T(S)$. The set of
sets of two-valued structures definable via three-valued structures with tight
concretization is
models[$T_2$] $= \{ \gamma_T^*(\mathcal{S}) \mid \mathcal{S}$ a finite set of
$\mathcal{A}_1$-bounded three-valued structures$\}$. We call the set of sets
models[$T_2$] boolean shape analysis constraints (the results of this paper
justify the name).
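Definition 1 determines the three-valued structure uniquely once the two-valued structure and the function $h$ are fixed. The following Python fragment is a minimal sketch of that computation, not part of the original development; the dictionary-based encoding and the name `tight_abstraction` are assumptions made for illustration.

```python
from itertools import product

def tight_abstraction(universe, unary, binary, h):
    """Compute the three-valued structure S with S_natural ⊑_T^h S.

    universe: list of concrete individuals
    unary:    dict, predicate name -> set of individuals where it is 1
    binary:   dict, relation name  -> set of pairs where it is 1
    h:        dict, concrete individual -> abstract individual
    (This encoding is hypothetical; it only mirrors Definition 1.)
    """
    universe = list(universe)
    abstract = {h[u] for u in universe}
    # Condition 1: iota(A)(a) collects the values of A on the preimage of a.
    iota_unary = {
        (A, a): frozenset(int(u in ext) for u in universe if h[u] == a)
        for A, ext in unary.items()
        for a in abstract
    }
    # Condition 2: iota(f)(a1, a2) collects the values of f on preimage pairs.
    iota_binary = {
        (f, a1, a2): frozenset(
            int((u1, u2) in ext)
            for u1, u2 in product(universe, repeat=2)
            if h[u1] == a1 and h[u2] == a2
        )
        for f, ext in binary.items()
        for a1, a2 in product(abstract, repeat=2)
    }
    return abstract, iota_unary, iota_binary
```

A value `frozenset({0, 1})` plays the role of the indefinite truth value: it appears exactly when the concrete individuals merged by $h$ disagree.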

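If $A \in \mathcal{A}$ and $\alpha \in \{0,1\}$ then $A^\alpha$ is defined by
$A^1 = A$ and $A^0 = \neg A$. A cube over $\mathcal{A}_1$ (or just “cube” for
short) is an expression $P(x)$ of the form
$A_1^{\alpha_1}(x) \wedge \ldots \wedge A_q^{\alpha_q}(x)$ where
$\alpha_1, \ldots, \alpha_q \in \{0,1\}$.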

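Definition 2 ($TR_1$-literal). Let $P_1(x), P_2(x)$ range over cubes over
$\mathcal{A}_1$, let $A$ range over elements of
$\mathcal{A} \setminus \mathcal{A}_1$, and let $f$ range over $\mathcal{F}$. A
$TR_1$-atomic-formula is a formula of one of the following forms:

$\exists x.\ P_1(x)$

$\exists x.\ P_1(x) \wedge A(x)$

$\exists x.\ P_1(x) \wedge \neg A(x)$

$\exists x \exists y.\ P_1(x) \wedge P_2(y) \wedge f(x,y)$

$\exists x \exists y.\ P_1(x) \wedge P_2(y) \wedge \neg f(x,y)$

A $TR_1$-literal is a $TR_1$-atomic-formula or its negation.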

$TR_1$-formulas correspond to formulas in [48, Chapter 7]. $TR_1$-formulas
satisfy syntactic invariants that make them isomorphic to three-valued
structures with tight-concretization semantics.

Definition 3 ($TR_1$-formulas). Let $P(x), P_1(x), P_2(y)$ denote cubes over
$\mathcal{A}_1$. A canonical conjunction of $TR_1$-literals is a conjunction of
$TR_1$-literals that satisfies all of the following properties:

P1. for each $P(x)$ a cube over $\mathcal{A}_1$, exactly one of the conjuncts
$\exists x. P(x)$ and $\neg\exists x. P(x)$ occurs;

P2. there is at least one cube $P(x)$ such that the conjunct $\exists x. P(x)$
occurs in the conjunction;

P3. if the conjunct $\neg\exists x. P(x)$ occurs, then this conjunct is the only
occurrence of the cube $P(x)$ (and the cube $P(y)$) in the conjunction;

P4. for each cube $P(x)$ such that $\exists x. P(x)$ occurs, and each
$A \in \mathcal{A} \setminus \mathcal{A}_1$, exactly one of the following three
conditions holds:

1. $\neg\exists x.\ P(x) \wedge A(x)$ occurs in the conjunction,

2. $\neg\exists x.\ P(x) \wedge \neg A(x)$ occurs in the conjunction,

3. both $\exists x.\ P(x) \wedge A(x)$ and $\exists x.\ P(x) \wedge \neg A(x)$
occur in the conjunction;

P5. for every two cubes $P_1(x)$ and $P_2(y)$ such that the conjuncts
$\exists x. P_1(x)$ and $\exists x. P_2(x)$ occur, and for every
$f \in \mathcal{F}$, exactly one of the following three conditions holds:

1. $\neg\exists x \exists y.\ P_1(x) \wedge P_2(y) \wedge f(x,y)$ occurs in the
conjunction;

2. $\neg\exists x \exists y.\ P_1(x) \wedge P_2(y) \wedge \neg f(x,y)$ occurs in
the conjunction;

3. both $\exists x \exists y.\ P_1(x) \wedge P_2(y) \wedge f(x,y)$ and
$\exists x \exists y.\ P_1(x) \wedge P_2(y) \wedge \neg f(x,y)$ occur in the
conjunction.

A $TR_1$-formula is a disjunction of canonical conjunctions of $TR_1$-literals.

A small difference between $TR_1$-formulas and formulas in [48, Definition 7.3.3,
Page 31] is that [48, Definition 7.3.3, Page 31] does not contain conjuncts of
the form $\neg\exists x. P(x)$ stating the emptiness of each empty cube, but
instead contains one conjunct of the form $\forall x. \bigvee_P P(x)$ where $P$
ranges over all non-empty cubes.

The following Proposition 4 shows that $TR_1$-formulas capture precisely the
meaning of three-valued structures under tight concretization. The proof of
Proposition 4 was first presented in [48, Appendix B] (and reviewed in [31, Page
9]). The proof shows that $TR_1$-formulas and bounded three-valued structures
can be viewed as different notations for the same mathematical structure.

Proposition 4. models[$TR_1$] = models[$T_2$]

3 A New Characterization of Three-Valued Structures 

Definition 5 introduces the new syntactic class of formulas characterizing
three-valued structures under tight concretization semantics. Theorem 6 gives a
constructive proof of the correctness of the characterization.

Definition 5 ($TR_4$-formulas). Let $B_1(x), B_2(y)$ range over arbitrary
boolean combinations of elements of $\mathcal{A}_1$, let $Q(x)$ range over
disjunctions of literals of the form $A(x)$ and $\neg A(x)$ where
$A \in \mathcal{A} \setminus \mathcal{A}_1$, and let $g(x,y)$ range over
disjunctions of literals of the form $f(x,y)$ and $\neg f(x,y)$ where
$f \in \mathcal{F}$.

A $TR_4$-atomic-formula is a formula of one of the following forms:

1. $\exists x.\ B_1(x)$

2. $\exists x.\ B_1(x) \wedge Q(x)$

3. $\exists x \exists y.\ B_1(x) \wedge B_2(y) \wedge g(x,y)$

$\exists x.\ B_1(x) \vee B_2(x) \;\to\; (\exists x. B_1(x)) \vee (\exists x. B_2(x))$

$\exists x.\ (B_1(x) \vee B_2(x)) \wedge Q(x) \;\to\; (\exists x. B_1(x) \wedge Q(x)) \vee (\exists x. B_2(x) \wedge Q(x))$

$\exists x.\ B_1(x) \wedge (Q_1(x) \vee Q_2(x)) \;\to\; (\exists x. B_1(x) \wedge Q_1(x)) \vee (\exists x. B_1(x) \wedge Q_2(x))$

$\exists x \exists y.\ (B_{11}(x) \vee B_{12}(x)) \wedge B_2(y) \wedge g(x,y) \;\to\; \exists x \exists y.\ B_{11}(x) \wedge B_2(y) \wedge g(x,y) \;\vee\; \exists x \exists y.\ B_{12}(x) \wedge B_2(y) \wedge g(x,y)$

$\exists x \exists y.\ B_1(x) \wedge (B_{21}(y) \vee B_{22}(y)) \wedge g(x,y) \;\to\; \exists x \exists y.\ B_1(x) \wedge B_{21}(y) \wedge g(x,y) \;\vee\; \exists x \exists y.\ B_1(x) \wedge B_{22}(y) \wedge g(x,y)$

$\exists x \exists y.\ B_1(x) \wedge B_2(y) \wedge (g_1(x,y) \vee g_2(x,y)) \;\to\; \exists x \exists y.\ B_1(x) \wedge B_2(y) \wedge g_1(x,y) \;\vee\; \exists x \exists y.\ B_1(x) \wedge B_2(y) \wedge g_2(x,y)$

Fig. 1. Transforming $TR_4$-literals into $TR_1$-literals.

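Ensure each of the Properties of Definition 3 by applying the appropriate rules:

P1. $(\exists x. P(x)) \wedge (\neg\exists x. P(x)) \;\to\; \mathit{false}$
    $\mathit{true} \;\to\; (\exists x. P(x)) \vee (\neg\exists x. P(x))$

P2. $\bigwedge_{P \in \mathit{cubes}} \neg\exists x. P(x) \;\to\; \mathit{false}$

P3. $(\neg\exists x. P(x)) \wedge (\exists x. P(x) \wedge Q(x)) \;\to\; \mathit{false}$
    $(\neg\exists x. P(x)) \wedge (\exists x \exists y. P(x) \wedge Q(x,y)) \;\to\; \mathit{false}$
    $(\neg\exists x. P(x)) \wedge (\exists x \exists y. P(y) \wedge Q(x,y)) \;\to\; \mathit{false}$
    $(\neg\exists x. P(x)) \wedge (\neg\exists x. P(x) \wedge Q(x)) \;\to\; \neg\exists x. P(x)$
    $(\neg\exists x. P(x)) \wedge (\neg\exists x \exists y. P(x) \wedge Q(x,y)) \;\to\; \neg\exists x. P(x)$
    $(\neg\exists x. P(x)) \wedge (\neg\exists x \exists y. P(y) \wedge Q(x,y)) \;\to\; \neg\exists x. P(x)$

P4. $(\exists x. P(x) \wedge Q(x)) \wedge (\neg\exists x. P(x) \wedge Q(x)) \;\to\; \mathit{false}$
    $(\neg\exists x. P(x) \wedge A(x)) \wedge (\neg\exists x. P(x) \wedge \neg A(x)) \;\to\; \neg\exists x. P(x)$
    $\mathit{true} \;\to\; (\neg\exists x. P(x) \wedge A(x)) \;\vee\; (\neg\exists x. P(x) \wedge \neg A(x)) \;\vee\; (\exists x. P(x) \wedge A(x)) \wedge (\exists x. P(x) \wedge \neg A(x))$
    + rules for P3

P5. $(\exists x \exists y. P_1(x) \wedge P_2(y) \wedge Q(x,y)) \wedge (\neg\exists x \exists y. P_1(x) \wedge P_2(y) \wedge Q(x,y)) \;\to\; \mathit{false}$
    $(\neg\exists x \exists y. P_1(x) \wedge P_2(y) \wedge f(x,y)) \wedge (\neg\exists x \exists y. P_1(x) \wedge P_2(y) \wedge \neg f(x,y)) \;\to\; (\neg\exists x. P_1(x)) \vee (\neg\exists y. P_2(y))$
    $\mathit{true} \;\to\; (\neg\exists x \exists y. P_1(x) \wedge P_2(y) \wedge f(x,y)) \;\vee\; (\neg\exists x \exists y. P_1(x) \wedge P_2(y) \wedge \neg f(x,y)) \;\vee\; (\exists x \exists y. P_1(x) \wedge P_2(y) \wedge f(x,y)) \wedge (\exists x \exists y. P_1(x) \wedge P_2(y) \wedge \neg f(x,y))$
    + rules for P3

Fig. 2. Transforming a conjunction of $TR_1$-literals into a canonical
conjunction of $TR_1$-literals.

1. apply the rules in Figure 1 to transform the $TR_4$-formula into a boolean
combination of $TR_1$-literals;

2. transform the formula into a disjunction of conjunctions of $TR_1$-literals;

3. apply the rules in Figure 2 to transform each conjunction of $TR_1$-literals
into a canonical conjunction of $TR_1$-literals, while keeping the formula in
disjunctive normal form.

Fig. 3. Normalization algorithm for transforming $TR_4$-formulas into
$TR_1$-formulas.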



A $TR_4$-literal is a $TR_4$-atomic-formula or its negation. A $TR_4$-formula is
a boolean combination of $TR_4$-atomic-formulas.

Theorem 6. The algorithm sketched in Figures 3, 1, 2 converts a $TR_4$-formula
into an equivalent $TR_1$-formula in a finite number of steps.

Corollary 7. models[$TR_4$] = models[$TR_1$] = models[$T_2$].

By definition, $TR_4$-formulas are closed under all boolean operations.
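The first step of the normalization algorithm rewrites each boolean combination $B(x)$ of abstraction predicates into the disjunction of the cubes that imply it, so that $\exists x. B(x)$ becomes a disjunction of formulas $\exists x. P(x)$. The Python fragment below is a minimal sketch of that expansion by truth-table enumeration; the representation of $B$ as a Python predicate and the name `cubes_of` are assumptions for illustration, not the paper's rewrite system.

```python
from itertools import product

def cubes_of(abstraction_preds, b):
    """Return the cubes over abstraction_preds that imply b.

    abstraction_preds: list of predicate names (the set A_1)
    b: function from a valuation dict {name: 0 or 1} to a truth value
    A cube is encoded as a tuple of 0/1 exponents, one per predicate,
    matching the notation A^1 = A and A^0 = not A.
    """
    cubes = []
    for bits in product((0, 1), repeat=len(abstraction_preds)):
        valuation = dict(zip(abstraction_preds, bits))
        if b(valuation):
            cubes.append(bits)
    return cubes

# "exists x. A1(x) or not A2(x)" expands into one "exists x. P(x)" per cube.
expanded = cubes_of(["A1", "A2"], lambda v: v["A1"] or not v["A2"])
```

The enumeration is exponential in the number of abstraction predicates, which matches the number of cubes a canonical conjunction has to mention.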

Corollary 8. 1. The family of sets models[$T_2$] forms a boolean algebra of sets
which is a subalgebra of the boolean algebra of all subsets of 2-STRUCT.

2. There is an algorithm that constructs, given two finite sets of bounded
three-valued structures $\mathcal{S}_1$ and $\mathcal{S}_2$, a finite set of
bounded three-valued structures $\mathcal{S}_3$ such that
$\gamma_T^*(\mathcal{S}_1) \subseteq \gamma_T^*(\mathcal{S}_2)$ iff
$\gamma_T^*(\mathcal{S}_3) = \emptyset$.

3. There is an algorithm that constructs, given two finite sets of bounded
three-valued structures $\mathcal{S}_1$ and $\mathcal{S}_2$, a finite set of
bounded three-valued structures $\mathcal{S}_3$ such that
$\gamma_T^*(\mathcal{S}_1) = \gamma_T^*(\mathcal{S}_2)$ iff
$\gamma_T^*(\mathcal{S}_3) = \emptyset$.
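One way to read items 2 and 3 is through the closure under boolean operations. The derivation below is a sketch of that reading; the names $F_1$, $F_2$ for $TR_1$-formulas corresponding to $\mathcal{S}_1$, $\mathcal{S}_2$ (which exist by Proposition 4) are introduced here for illustration.

```latex
\begin{align*}
\gamma_T^*(\mathcal{S}_1) \subseteq \gamma_T^*(\mathcal{S}_2)
  &\iff \gamma_T^*(\mathcal{S}_1) \cap
        \bigl(\text{2-STRUCT} \setminus \gamma_T^*(\mathcal{S}_2)\bigr) = \emptyset \\
  &\iff \gamma_F(F_1 \wedge \neg F_2) = \emptyset , \\
\gamma_T^*(\mathcal{S}_1) = \gamma_T^*(\mathcal{S}_2)
  &\iff \gamma_F\bigl((F_1 \wedge \neg F_2) \vee (F_2 \wedge \neg F_1)\bigr) = \emptyset .
\end{align*}
```

Both $F_1 \wedge \neg F_2$ and $(F_1 \wedge \neg F_2) \vee (F_2 \wedge \neg F_1)$ are $TR_4$-formulas, so the normalization algorithm of Figure 3 turns each into a $TR_1$-formula, which can be read back as the set $\mathcal{S}_3$.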

Note. Every $TR_4$-formula with the set of abstraction predicates
$\mathcal{A}_1 \subseteq \mathcal{A}$ is also a $TR_4$-formula with the set of
abstraction predicates $\mathcal{A}_1 = \mathcal{A}$. When
$\mathcal{A} = \mathcal{A}_1$, the class of $TR_4$-formulas can be defined
simply as boolean combinations of formulas 1) $\exists x.\ B_1(x)$, and
2) $\exists x \exists y.\ B_1(x) \wedge B_2(y) \wedge g(x,y)$, where $B_1(x)$,
$B_2(y)$ are boolean combinations of literals of the form $A(x)$ and $A(y)$ for
$A \in \mathcal{A}$, and $g(x,y)$ ranges over disjunctions of literals of the
form $f(x,y)$ and $\neg f(x,y)$ for $f \in \mathcal{F}$.

4 Decidability of Independent Predicates 

We next examine the decidability of questions of the form: “Given sets
$\mathcal{A}$ and $\mathcal{F}$ and a $TR_1$-formula $F$ over predicates
$\mathcal{A}$ and $\mathcal{F}$, is $F$ satisfiable?” (Note that the sets
$\mathcal{A}$ and $\mathcal{F}$ are part of the input to the decision procedure;
for fixed finite sets $\mathcal{A}$ and $\mathcal{F}$ there are finitely many
three-valued structures, so the decision problem would be trivial.) It turns out
that satisfiability of $TR_1$-formulas over 2-STRUCT is decidable because the
family of $TR_1$-formulas over 2-STRUCT has the small model property. It is easy
to construct a model $\langle U^\natural, \iota^\natural \rangle$ of a canonical
conjunction of $TR_1$-literals by introducing at most two elements of the domain
$U^\natural$ for each non-empty cube.

Proposition 9. Let $F$ be a canonical conjunction of $TR_1$-literals and let the
number of cubes $P(x)$ over $\mathcal{A}_1$ such that $\exists x. P(x)$ occurs
in $F$ be $n$. Then there exists a two-valued structure
$S^\natural = \langle U^\natural, \iota^\natural \rangle$ such that
$|U^\natural| = 2n$ and $F$ is true in $S^\natural$.
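The construction behind Proposition 9 can be sketched as follows: allocate two individuals per non-empty cube and use the second one only to witness the "both occur" cases of properties P4 and P5. The Python fragment below is an illustration under assumed encodings (cube labels, and the strings "0", "1", "1/2" for the three cases of P4 and P5); it is not taken from the paper.

```python
def small_model(cubes, nonabs, relations, unary_val, binary_val):
    """Build a two-valued structure with two individuals per non-empty cube.

    cubes:      list of labels of the non-empty cubes
    nonabs:     names of the non-abstraction unary predicates
    relations:  names of the binary relations
    unary_val[(A, P)] and binary_val[(f, P1, P2)] are "0", "1" or "1/2",
    where "1/2" stands for the case in which both the positive and the
    negative existential conjunct occur (hypothetical encoding).
    """
    universe = [(P, i) for P in cubes for i in (0, 1)]
    unary = {A: set() for A in nonabs}
    for A in nonabs:
        for (P, i) in universe:
            v = unary_val[(A, P)]
            # In the "1/2" case individual 0 satisfies A, individual 1 does not.
            if v == "1" or (v == "1/2" and i == 0):
                unary[A].add((P, i))
    binary = {f: set() for f in relations}
    for f in relations:
        for u in universe:
            for w in universe:
                v = binary_val[(f, u[0], w[0])]
                # In the "1/2" case exactly one pair is related, so both
                # a related and an unrelated pair exist.
                if v == "1" or (v == "1/2" and u[1] == 0 and w[1] == 0):
                    binary[f].add((u, w))
    return universe, unary, binary
```

With $n$ non-empty cubes the universe has exactly $2n$ individuals, as in the statement of the proposition.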

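Corollary 10. $\gamma_T^*(\mathcal{S}) = \emptyset$ iff $\mathcal{S} = \emptyset$.

Corollary 11. The following questions are decidable for sets
$\mathcal{S}_1, \mathcal{S}_2$ of three-valued structures:
1) $\gamma_T^*(\mathcal{S}_1) = \emptyset$;
2) $\gamma_T^*(\mathcal{S}_1) \subseteq \gamma_T^*(\mathcal{S}_2)$;
3) $\gamma_T^*(\mathcal{S}_1) = \gamma_T^*(\mathcal{S}_2)$.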

5 Structures with Defined Predicates 

Previous sections interpret three-valued structures and formulas over the set
2-STRUCT of all two-valued structures. In general, it is useful to interpret
three-valued structures and formulas over some subset
2-CSTRUCT $\subseteq$ 2-STRUCT of compatible two-valued structures [46, Page
268]. The meaning of tight concretization with respect to 2-CSTRUCT is
$c\gamma_T^*(\mathcal{S}) = \gamma_T^*(\mathcal{S}) \cap$ 2-CSTRUCT, and we let
models[$cT_2$] denote the set of all $c\gamma_T^*(\mathcal{S})$ for all finite
sets $\mathcal{S}$ of bounded three-valued structures. To characterize the
meaning of three-valued structures over 2-CSTRUCT, for each class of formulas
$TR_i$ we introduce the corresponding class $cTR_i$ by conjoining the formulas
with a first-order formula that characterizes the subset 2-CSTRUCT. It then
follows that models[$cTR_i$] $= \{ \mathcal{S}^\natural \cap$ 2-CSTRUCT
$\mid \mathcal{S}^\natural \in$ models[$TR_i$]$\}$. Hence, models[$cTR_4$] is a
subalgebra of the boolean algebra of subsets of 2-CSTRUCT, its sets are subsets
of elements of models[$TR_4$], and the following generalization of Corollary 11
holds.

Corollary 12. Assume that the satisfiability for three-valued structures
interpreted over 2-CSTRUCT using tight concretization is decidable. Then the
entailment and the equivalence of three-valued structures interpreted over
2-CSTRUCT using tight concretization are also decidable.

6 Further Related Work 

Researchers have proposed several program checking techniques based on
dataflow analysis and symbolic execution [13,19,14,8,10,6,35]. The primary
strength of shape analysis compared to the alternative approaches is the
ability to perform sound and precise reasoning about dynamically allocated data
structures.

Our work follows the line of shape analysis approaches which view the program
as operating on concrete graph structures [46,22,9,32,20,16,15,37,27]. An
alternative approach is to identify each heap object using the set of paths that
lead to the object [12,18,7]. Other notations for reasoning about the heap include
spatial logic [21] and alias types [47]. In the past we have seen a contrast between
the approach to verification of dynamically allocated data structures based on
Hoare logic [37,21,14], and the approach based on manipulation of graph
summaries [32,9,22,16]. The work [20] and especially [45] are important steps in
bringing these two views together. Along with the recent work [40,49,50,48] our
paper makes further contributions to unifying these two approaches.

The parametric framework for shape analysis is presented in [45,46]. A
systematic presentation of three-valued logic with equality is given in [38]. A
description of a three-valued logic analyzer TVLA is in [33], an extension to
interprocedural analysis is in [43,42], and the use of shape analysis for program
verification is demonstrated in [34]. A finite differencing approach for
automatically computing transfer functions for analysis is presented in [39]. A
shape analysis tool must ultimately take into account the definitions of
instrumentation predicates, which requires some form of theorem proving or
decision procedures. The original work [46, Page 272] uses rules based on Horn
clauses for such reasoning, whereas [40,49,50,48] (see Section 1) propose the use
of theorem provers and decision procedures. In this paper we have identified one
component of the problem that is always decidable and useful: it is always
possible to reduce entailment and equivalence problems to the satisfiability
problem. Of great importance for taking advantage of our result, as well as the
results of [40,49,50,48], are decidable logics that can express heap properties.
Among the promising such logics are monadic second-order logic of trees [23],
the logic $L_r$ [4], and role logic [30].

It is possible to apply predicate abstraction techniques [3,2,17] to perform
shape analysis; the view of three-valued structures as boolean combinations of
constraints of a certain form may be beneficial for this direction of work and
may make it easier to apply representations such as binary decision diagrams
[11,5,36]. The boolean algebra of state predicates and predicate transformers
has been used successfully as the foundation of refinement calculus [1]. In this
paper we have identified a particular subalgebra of the boolean algebra of all
state predicates; we view this boolean algebra as providing the foundation of
shape analysis.



7 Conclusions and Future Work 

We have presented a new characterization of the constraints used as dataflow
facts in parametric shape analysis. Our characterization represents these
dataflow facts as boolean combinations of formulas. Among the useful
consequences of the closure of boolean shape analysis constraints under all
boolean operations is the fact that the entailment and the equivalence of these
constraints are reducible to the satisfiability of the constraints.

In this paper we have focused on the tight-concretization semantics of
three-valued structures. In the full version of the paper [31] we additionally
show similar results for standard-concretization semantics of three-valued
structures, with one important difference: the resulting constraints are closed
under conjunction and disjunction, but not under negation. In fact, the least
boolean algebra generated by those constraints is precisely the boolean algebra
of boolean shape analysis constraints.

We view the results of this paper as a step in further understanding of the
foundations of shape analysis. To make the connection with [46], this paper starts
with three-valued structures and proceeds to characterize the structures using
formulas. An alternative approach is to start with formulas that express the
desired properties and then explore efficient ways of representing and
manipulating these formulas. We believe that the entire framework [46] can be
reformulated using canonical forms of formulas instead of three-valued
structures. We also expect that the idea of viewing dataflow facts as canonical
forms of formulas is methodologically useful in general, especially for the
analyses that verify complex program properties.

Acknowledgements. The results of this paper were inspired in part by the
discussions with Andreas Podelski, Thomas Reps, Mooly Sagiv, Greta Yorsh, and
David Schmidt. We thank Greta Yorsh and Darko Marinov for useful comments

Appendix: Example

Figure 4 illustrates the use of boolean shape analysis constraints and their
closure properties. The left column introduces a set of constraints that provide
a partial specification of an operation that removes some elements from a list.
The right column shows that the conjunction of these constraints is a
$TR_4$-formula. In this example
$\mathcal{A}_1 = \mathcal{A} = \{A_l, A_r, A_{r0}\}$ and $\mathcal{F} = \{f\}$.
The predicate $A_l$ represents the object referenced by a local variable. The
predicate $A_{r0}$ denotes the set of nodes reachable from the local variable in
the initial state, whereas the predicate $A_r$ denotes the set of nodes reachable
from the local variable in the final state of the operation. The binary relation
$f$ represents the value of the “next” pointer of the list in the final state.
The meaning of the constraints is the following: $C_1$) the first element of the
list has no incoming references; $C_2$) the list has at least two elements;
$C_3$) the object referenced by the local variable is reachable from the local
variable in both pre- and post-state; $C_4$) following reachable nodes along the
$f$ field yields reachable nodes; $C_5$) all nodes are reachable; $C_6$) the data
structure operation only removes elements from the set, it does not add any
elements.

Consider the question whether the formula $C_7$ in Figure 4 is a consequence
of the conjunction of constraints $\bigwedge_{i=1}^{6} C_i$. Transform the
formula $\neg C_7 \wedge \bigwedge_{i=1}^{6} C_i$ into a $TR_1$-formula using our
normalization algorithm in Figure 3. The result is the set of counterexamples in
Figure 5 (represented as three-valued structures). These counterexamples show
that the formula $C_7$ is not a consequence of the constraints in Figure 4, and
show the set of scenarios in which the violation of formula $C_7$ can occur.
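The failure of the entailment can also be observed on a single concrete structure. The Python fragment below is a hand-built illustration, not one of the structures of Figure 5 and not the normalization algorithm: it evaluates $C_1, \ldots, C_7$ (as reconstructed in Figure 4) on a hypothetical three-node structure in which node `c` was reachable only in the initial state but still points into the final list.

```python
from itertools import product

def check(universe, Al, Ar, Ar0, f):
    """Evaluate constraints C1..C6 and C7 on a two-valued structure.

    Al, Ar, Ar0 are sets of nodes; f is a set of (source, target) pairs.
    Returns (list of truth values of C1..C6, truth value of C7).
    """
    U = list(universe)
    c1 = all(not any((x, y) in f for x in U) for y in Al)
    c2 = any(x in Al and y not in Al and (x, y) in f
             for x, y in product(U, repeat=2))
    c3 = all(x in Ar and x in Ar0 for x in Al)
    c4 = all(y in Ar for (x, y) in f if x in Ar)
    c5 = all(x in Al or x in Ar or x in Ar0 for x in U)
    c6 = all(x in Ar0 for x in Ar)
    c7 = all(y not in Ar for (x, y) in f if x not in Al and x not in Ar)
    return [c1, c2, c3, c4, c5, c6], c7

# Hypothetical structure: a -> b is the final list, c is a removed node
# whose "next" field still points to b.
premises, conclusion = check(
    universe=["a", "b", "c"],
    Al={"a"},
    Ar={"a", "b"},
    Ar0={"a", "b", "c"},
    f={("a", "b"), ("c", "b")},
)
```

Here all six premises hold while $C_7$ fails, because the edge from `c` to `b` leads from a node outside $A_l \cup A_r$ into $A_r$.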



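Constraint | Equivalent $TR_4$ form
$C_1$: $\forall y.\ A_l(y) \Rightarrow \neg\exists x.\ f(x,y)$ | $\neg\exists x \exists y.\ A_l(y) \wedge f(x,y)$
$C_2$: $\exists x \exists y.\ A_l(x) \wedge \neg A_l(y) \wedge f(x,y)$ | $\exists x \exists y.\ A_l(x) \wedge \neg A_l(y) \wedge f(x,y)$
$C_3$: $\forall x.\ A_l(x) \Rightarrow A_r(x) \wedge A_{r0}(x)$ | $\neg\exists x.\ \neg(A_l(x) \Rightarrow A_r(x) \wedge A_{r0}(x))$
$C_4$: $\forall x \forall y.\ A_r(x) \wedge f(x,y) \Rightarrow A_r(y)$ | $\neg\exists x \exists y.\ A_r(x) \wedge \neg A_r(y) \wedge f(x,y)$
$C_5$: $\forall x.\ A_l(x) \vee A_r(x) \vee A_{r0}(x)$ | $\neg\exists x.\ \neg(A_l(x) \vee A_r(x) \vee A_{r0}(x))$
$C_6$: $\forall x.\ A_r(x) \Rightarrow A_{r0}(x)$ | $\neg\exists x.\ \neg(A_r(x) \Rightarrow A_{r0}(x))$

$C_7$: $\forall x \forall y.\ \neg A_l(x) \wedge \neg A_r(x) \wedge f(x,y) \Rightarrow \neg A_r(y)$

Fig. 4. Example verification of entailment of boolean shape analysis constraints.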




Fig. 5. Counterexample structures for the entailment of constraints in Figure 4.
There are 2 · 3 · 3 counterexample three-valued structures, for different values
of edges.

Abstract. Symbolic model checking methods have been extended recently to the
verification of probabilistic systems. However, the representation of the
transition matrix may be expensive for very large systems and may induce a
prohibitive cost for the model checking algorithm. In this paper, we propose an
approximation method to verify quantitative properties on discrete Markov
chains. We give a randomized algorithm to approximate the probability that a
property expressed by some positive LTL formula is satisfied with high
confidence by a probabilistic system. Our randomized algorithm requires only a
succinct representation of the system and is based on an execution sampling
method. We also present an implementation and a few classical examples to
demonstrate the effectiveness of our approach.



1 Introduction 

In this paper, we address the problem of verifying quantitative properties on
discrete time Markov chains (DTMC). We present an efficient procedure to
approximate model checking of positive LTL formulas on probabilistic transition
systems. This procedure decides if the probability of a formula over the whole
system is greater than a certain threshold by sampling finite execution paths.
It allows us to verify monotone properties on the system with high confidence.
For example, we can verify a property such as: “the probability that the
message sent will be received without error is greater than 0.99”. This method
is an improvement on the method described in [16].

The main advantage of this approach is to allow verification of formulas even
if the transition system is huge, even without any abstraction. Indeed, we do
not have to deal with the state space explosion phenomenon because we verify
the property on only one finite execution path at a time. This approach can be
used in addition to classical probabilistic model checkers when the verification
is intractable.

Our main results are:

— A method that allows the efficient approximation of the satisfaction
probability of monotone properties on probabilistic systems.

— A tool named APMC that implements the method. We use it to verify extremely
large systems such as Pnueli and Zuck’s 500 dining philosophers.

The paper is organized as follows. In Section 2, we review related work on
probabilistic verification of qualitative and quantitative properties. In Section 3,
we consider fully probabilistic systems and classical LTL logic. In Section 4, we
explain how to adapt the main idea of the bounded model checking approach
to the probabilistic framework. In Section 5, we present a randomized algorithm
for the approximation of the satisfaction probability of monotone properties. In
Section 6, we present our tool, give experimental results, and compare them
with those of the probabilistic model checker PRISM [7].

2 Related Work 

Several methods have been proposed to verify a probabilistic or a concurrent
probabilistic system against LTL formulas. Vardi and Wolper [26,27] developed
an automata-theoretic approach for verifying qualitative properties stating
that a linear time formula holds with probability 0 or 1. Pnueli and Zuck [22]
introduced a model checking method for this problem.

Courcoubetis and Yannakakis [5] studied probabilistic verification of
quantitative properties expressed in the linear time framework. For the fully
probabilistic case, the time complexity of their method is polynomial in the size
of the state space, and exponential in the size of the formula. For the
concurrent case, the time complexity is linear in the size of the system, and
double exponential in the size of the formula.

Hansson and Jonsson [9] introduced the logic PCTL (Probabilistic Computation
Tree Logic) and proposed a model checking algorithm for fully probabilistic
systems. They combined reachability-based computation, as in classical model
checking, and resolution of systems of linear equations to compute the
probability associated with the until operator. For concurrent probabilistic
systems, Bianco and de Alfaro [3] showed that the minimal and maximal
probabilities for the until operator can be computed by solving linear
optimization problems. The time complexity of these algorithms is polynomial in
the size of the system and linear in the size of the formula.

There are a few model checking tools that are designed for the verification of
quantitative specifications. ProbVerus [8] uses PCTL model checking and symbolic
techniques to verify PCTL formulas on fully probabilistic systems. PRISM
[7,15] is a probabilistic symbolic model checker that can check PCTL formulas
on fully or concurrent probabilistic systems. Reachability-based computation
is implemented using BDDs, and numerical analysis may be performed by a
choice between three methods: MTBDD-based representation of matrices,
conventional sparse matrices, or a hybrid approach. The Erlangen-Twente Markov
Chain Checker [10] ($E \vdash MC^2$) supports model checking of continuous-time
Markov chains against specifications expressed in continuous-time stochastic
logic (CSL). Rapture, presented in [6] and [12], uses abstraction and refinement
to check a subset of PCTL over concurrent probabilistic systems.

In [28], Younes and Simmons described a procedure for verifying properties of
discrete event systems based on Monte-Carlo simulation and statistical
hypothesis testing. This procedure uses a refinement technique to build
statistical tests for the satisfaction probability of CSL formulas. Their logic
framework is more general than ours, but they cannot predict the sampling size,
in contrast with our approximation method in which this size is exactly known
and tractable. Rabinovich [24] gives an algorithm to calculate the probability
that a property of a probabilistic lossy channel system is satisfied. Monniaux
[18] defined abstract interpretation of probabilistic programs to obtain
over-approximations for probability measures. We use a similar Monte-Carlo
method to approximate quantitative properties.



3 Probabilistic Transition Systems 

In this section, we introduce the classical concepts for the verification of
probabilistic systems.

Definition 1. A Discrete Time Markov Chain (DTMC) is a pair
$\mathcal{M} = (S, P)$ where $S$ is a finite or enumerable set of states and
$P : S \times S \to [0,1]$ is a transition probability function, i.e. for all
$s \in S$, $\sum_{t \in S} P(s,t) = 1$. If $S$ is finite, we can consider $P$ to
be a transition matrix.

The notion of DTMC can be extended to the notion of probabilistic transition
system by adding a labeling function.

Definition 2. A fully probabilistic transition system (PTS) is a structure
$\mathcal{M} = (S, P, I, L)$ where $(S, P)$ is a DTMC, $I$ is the set of initial
states and $L : S \to \mathcal{P}(AP)$ a function which labels each state with a
set of atomic propositions.

Definition 3. A path $\sigma$ of a PTS is a finite or infinite sequence of states
$(s_0, s_1, \ldots, s_i, \ldots)$ such that $P(s_i, s_{i+1}) > 0$ for all
$i \ge 0$.

We denote by $Path(s)$ the set of paths whose first state is $s$. We also write
$\sigma(i)$ for the $(i+1)$-th state of path $\sigma$ and $\sigma^i$ for the path
$(\sigma(i), \sigma(i+1), \ldots)$. The length of a path $\sigma$ is the number
of states in the path and is denoted by $|\sigma|$; this length can be infinite.

Definition 4. For each PTS $\mathcal{M}$ and state $s$, we may define a
probability measure $Prob$ on the set $Path(s)$. $Prob$ denotes here the unique
probability measure on the Borel field of sets generated by the basic cylinders
$\{\sigma \mid \sigma$ is a path and $(s_0, s_1, \ldots, s_n)$ is a prefix of
$\sigma\}$ where
$Prob(\{\sigma \mid \sigma$ is a path and $(s_0, s_1, \ldots, s_n)$ is a prefix
of $\sigma\}) = \prod_{i=1}^{n} P(s_{i-1}, s_i)$.
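The measure of a basic cylinder is simply the product of the transition probabilities along its prefix. The following Python fragment is a minimal sketch of that formula; the nested-dictionary encoding of $P$ and the two-state chain are assumptions chosen for illustration.

```python
from math import prod

def cylinder_probability(P, prefix):
    """Probability of the basic cylinder of paths extending `prefix`.

    P: dict of dicts, P[s][t] is the transition probability from s to t
       (missing entries are treated as probability 0)
    prefix: non-empty list of states (s_0, ..., s_n)
    """
    return prod(P[s].get(t, 0.0) for s, t in zip(prefix, prefix[1:]))

# Hypothetical two-state chain: from "a" stay or move with probability 1/2.
P = {"a": {"a": 0.5, "b": 0.5}, "b": {"b": 1.0}}
p = cylinder_probability(P, ["a", "a", "b"])
```

For a prefix consisting of a single state the product is empty, so the cylinder has measure 1, which agrees with the formula for $n = 0$.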

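Definition 5. Let $\sigma$ be a path of length $k$ in a PTS $\mathcal{M}$. The
satisfaction of an LTL formula on $\sigma$ is defined as follows:

— $\mathcal{M}, \sigma \models a$ iff $a \in L(\sigma(0))$.

— $\mathcal{M}, \sigma \models \neg\varphi$ iff $\mathcal{M}, \sigma \not\models \varphi$.

— $\mathcal{M}, \sigma \models \varphi \wedge \psi$ iff $\mathcal{M}, \sigma \models \varphi$ and $\mathcal{M}, \sigma \models \psi$.

— $\mathcal{M}, \sigma \models X\varphi$ iff $\mathcal{M}, \sigma^1 \models \varphi$ and $|\sigma| > 0$.

— $\mathcal{M}, \sigma \models \varphi U \psi$ iff there exists $0 \le j \le k$ s.t. $\mathcal{M}, \sigma^j \models \psi$ and for all $i < j$, $\mathcal{M}, \sigma^i \models \varphi$.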
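The clauses above translate directly into a recursive evaluator on a finite path. The Python fragment below is a sketch under stated assumptions: formulas are encoded as nested tuples, a path is given by the list of label sets of its states, and $X\varphi$ is taken to be false at the last state of the path (a boundary convention chosen here, which the definition above does not spell out in this form).

```python
def holds(formula, labels, i=0):
    """Evaluate an LTL formula on the suffix of a finite path starting at i.

    formula: nested tuples such as ("U", ("ap", "send"), ("ap", "recv"))
    labels:  list of sets, labels[j] is the label set L(sigma(j))
    """
    op = formula[0]
    if op == "ap":
        return formula[1] in labels[i]
    if op == "not":
        return not holds(formula[1], labels, i)
    if op == "and":
        return holds(formula[1], labels, i) and holds(formula[2], labels, i)
    if op == "or":
        return holds(formula[1], labels, i) or holds(formula[2], labels, i)
    if op == "X":
        # Assumed convention: no successor state means X is not satisfied.
        return i + 1 < len(labels) and holds(formula[1], labels, i + 1)
    if op == "U":
        for j in range(i, len(labels)):
            if holds(formula[2], labels, j):
                return True           # psi reached, phi held at all earlier positions
            if not holds(formula[1], labels, j):
                return False          # phi broken before psi was reached
        return False                  # psi never reached within the bound
    raise ValueError(f"unknown operator: {op}")
```

Each subformula is evaluated at most once per position by each enclosing operator, so the cost on a single path stays polynomial in the path length and the formula size for the formulas considered here.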

We now introduce a fragment of LTL which expresses only monotone properties.

Definition 6. The essentially positive fragment (EPF) of LTL is the set of
formulas built from atomic formulas ($p$), their negations ($\neg p$), closed
under $\vee$, $\wedge$ and the temporal operators $X$, $U$.

Definition 7. Let $Path_k(s)$ be the set of all paths of length $k$ in a PTS
starting at $s \in I$. The probability of an LTL formula $\varphi$ on
$Path_k(s)$ is the measure of paths satisfying $\varphi$ (as stated in
Definition 5) in $Path_k(s)$.

Definition 8. An LTL formula $\varphi$ is said to be monotone if and only if for
all $k$, for all paths $\sigma$ of length $k$,
$\mathcal{M}, \sigma \models \varphi \Rightarrow \mathcal{M}, \sigma^+ \models \varphi$,
where $\sigma^+$ is any path of which $\sigma$ is a prefix.

In [26], it is shown that for any LTL formula $\varphi$, probabilistic transition
system $\mathcal{M}$ and state $s$, the set of paths
$\{\sigma \mid \sigma(0) = s$ and $\mathcal{M}, \sigma \models \varphi\}$ is
measurable. We denote by $Prob[\varphi]$ the measure of this set.

4 Probabilistic Bounded Model Checking 

In this section, we review the classical framework for bounded model checking
of linear time temporal formulas over transition systems. Then, we show that
we cannot directly extend this approach, but we use the main idea of checking
formulas on paths of bounded length to approximate the target satisfaction
probability.

Biere, Cimatti, Clarke and Zhu [4] present a symbolic model checking technique
based on SAT procedures instead of BDDs. They introduce bounded model
checking (BMC), where the bound corresponds to the maximal length of a possible
counterexample. First, they give a correspondence between BMC and classical
model checking. Then they show how to reduce BMC to propositional
satisfiability in polynomial time.

To check the initial property $\varphi$, one should look for the existence of a
counterexample to $\varphi$, that is a path satisfying $\psi = \neg\varphi$ for
a given length $k$. In [4], the following result is also stated: if one does not
find such a counterexample for $k \le |S| \times 2^{|\varphi|}$, where $S$ is the
set of states, then the initial property is true. We cannot hope to find a
polynomial bound on $k$ with respect to the size of $S$ and $\varphi$ unless
NP=PSPACE, since the model checking problem for LTL is PSPACE-complete (see
[25]) and such a bound would yield a polynomial reduction to propositional
satisfiability.

We try to check $Prob[\psi] > b$ by considering $Prob_k[\psi] > b$, i.e., on the
probabilistic space limited to the set of paths of length $k$. Following the BMC
approach, we could associate to a formula $\psi$ and length $k$ a propositional
formula $\psi_k$ in such a way that a path of length $k$ satisfying $\psi$
corresponds to an assignment satisfying $\psi_k$. Thus determining
$Prob_k[\psi]$ could be reduced to the problem of counting the number of
assignments satisfying a propositional formula, called #SAT [20]. Unfortunately,
not only are no efficient algorithms known for such counting problems, but they
are believed to be strongly intractable (see, for instance, [20]). However, it is
not necessary to do such a transformation since we can evaluate the formula
directly on one finite path. In the following, we use this straightforward
evaluation instead of SAT-solving methods.

For many natural formulas, truth at length $k$ implies truth in the entire
model. These formulas are the so-called monotone formulas (see Definition 8).
We consider the subset (EPF, Definition 6) of LTL formulas which have this
property.

EPF formulas include nested compositions of $U$ but do not allow for negations
in front. Nevertheless, this fragment can express various classical properties
of transition systems such as reachability, livelock-freeness properties and
convergence properties of protocols.

Proposition 1. Let $\varphi$ be an LTL formula. If $\varphi \in$ EPF, then
$\varphi$ is monotone.

The proof of this proposition is immediate from the structure of the formula.

The monotonicity of the property defined by an EPF formula gives the
following result.

Proposition 2. For any formula $\varphi$ of the essentially positive fragment of
LTL, $0 < b < 1$ and $k \in \mathbb{N}$, if $Prob_k[\varphi] > b$, then
$Prob[\varphi] > b$.

Indeed, the probability of an EPF formula being true in the bounded model
of depth $k$ is less than or equal to the probability of the formula in any
bounded model of depth greater than $k$.

This proposition can be extended to any monotone formula but we restrict
our scope to EPF formulas to make our method fully automatic.

5 Approximate Probabilistic Model Checking 

In order to calculate the satisfaction probability of a monotone formula, we have 
to verify the formula on all paths of length k. Such a computation is intractable 
in general since there are exponentially many paths to check. Thus, it is natural 
to ask: can we approximate $\mathrm{Prob}_k[\phi]$? In this section, we propose an efficient
procedure to approximate this probability. The running time of this computation 
is polynomial in the length of paths and the size of the formula. 

In order to estimate the probabilities of monotone properties with a simple
randomized algorithm, we generate random paths in the probabilistic space
underlying the DTMC structure of depth k and compute a random variable $A/N$
which estimates $\mathrm{Prob}_k[\psi]$. To verify a statement
$\mathrm{Prob}_k[\psi] \ge b$, we test whether $A/N > b - \varepsilon$. Our
decision is correct with confidence $(1 - \delta)$ after a number of samples
polynomial in $1/\varepsilon$ and $\log(1/\delta)$. This result is obtained by
using Chernoff-Hoeffding bounds [11] on the tail of the distribution of a sum
of independent random variables. The main advantage of the method is that we
can proceed with just a succinct representation of the transition graph, that
is, a succinct description in an input language, for example Reactive
Modules [1].

Definition 9. A succinct representation, or diagram, of a PTS $\mathcal{M} = (S, P, I, L)$
is a representation of the PTS that allows one to generate algorithmically, for
any state $s$, the set of states $t$ such that $P(s,t) > 0$.

The size of such a representation is in the same order of magnitude as the 
PTS. Typically, for Reactive Modules, the size of the diagram is in $O(n \cdot p)$,
when the size of the PTS is in $O(n^p)$.
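As an illustration of Definition 9, the following sketch builds a diagram as a
successor function instead of an explicit transition matrix. It assumes a
hypothetical system of p modules with n local states each, scheduled uniformly
at random; the names `make_diagram`, `successors` and `reachable`, and the toy
dynamics (a scheduled module either stays put or advances, each with
probability 1/2), are illustrative assumptions and are not taken from Reactive
Modules or APMC.

```python
def make_diagram(p, n):
    """Return a successor function for p modules with n local states each.

    The description stores only p and n, while the global state space it
    describes contains n**p states.
    """
    def successors(state):
        # Map each global successor t to P(state, t).
        moves = {}
        for i in range(p):                      # module i is scheduled w.p. 1/p
            for y in (state[i], (state[i] + 1) % n):
                t = state[:i] + (y,) + state[i + 1:]
                moves[t] = moves.get(t, 0.0) + 0.5 / p
        return moves
    return successors


def reachable(successors, init):
    """Enumerate the global states reachable from init (for comparison only)."""
    seen, stack = {init}, [init]
    while stack:
        s = stack.pop()
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


succ = make_diagram(p=2, n=3)
first_step = succ((0, 0))
state_count = len(reachable(succ, (0, 0)))
```

Path generation only ever calls `successors` on the current state, so the
explicit enumeration done by `reachable` is never needed by the sampling
method itself.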

In order to prove our result, we introduce the notion of fully polynomial 
randomized approximation scheme (FPRAS) for probability problems. This no- 
tion is analogous to randomized approximation schemes [13,19] for counting 
problems. Our probability problem is defined by giving as input x a succinct 
representation of a probabilistic system, a formula and a positive integer k. The 
succinct representation is used to generate a set of execution paths of length k. 
The solution of the probability problem is the probability measure $\mu(x)$ of the
formula over the set of execution paths. The difference with randomized approx- 
imation schemes for counting problems is that for approximating probabilities, 
which are rational numbers in the interval [0, 1], we only require approximation 
with additive error. 

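Definition 10. A fully polynomial randomized approximation scheme (FPRAS)
for a probability problem is a randomized algorithm $\mathcal{A}$ that takes an input $x$, two
real numbers $0 < \varepsilon, \delta < 1$ and produces a value $\mathcal{A}(x, \varepsilon, \delta)$ such that:

$$\mathrm{Prob}\big[\,|\mathcal{A}(x, \varepsilon, \delta) - \mu(x)| \le \varepsilon\,\big] \ge 1 - \delta.$$

The running time of $\mathcal{A}$ is polynomial in $|x|$, $1/\varepsilon$ and $\log(1/\delta)$.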

The probability is taken over the random choices of the algorithm. We call $\varepsilon$
the approximation parameter and $\delta$ the confidence parameter. By verifying the
formula on $O(\frac{1}{\varepsilon^2} \log \frac{1}{\delta})$ paths, we obtain an answer with confidence $(1 - \delta)$.

Consider the following randomized algorithm designed to approximate
$\mathrm{Prob}_k[\psi]$, that is, the probability of an LTL formula over a bounded DTMC of
depth k:



Generic approximation algorithm GAA

Input: diagram, $\psi$, k, $\varepsilon$, $\delta$
N := 4 log(2/$\delta$) / $\varepsilon^2$
A := 0

For i = 1 to N do

    1. Generate a random path $\sigma$ of length k with the diagram

    2. If $\psi$ is true on $\sigma$ then A := A + 1

Return A/N
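The generic approximation algorithm can be sketched in a few lines of Python.
This is a minimal sketch under stated assumptions, not the APMC implementation:
the diagram is modeled as a hypothetical function `successors(s)` returning
`(target, probability)` pairs, the formula as a hypothetical predicate `holds`
on finite paths, "length k" is read as k transitions, and the logarithm is
taken to be the natural logarithm.

```python
import math
import random


def sample_count(eps, delta):
    """Number of paths N = ceil(4 ln(2/delta) / eps^2)."""
    return math.ceil(4 * math.log(2 / delta) / eps ** 2)


def random_path(successors, init, k, rng):
    """Draw one path with k transitions, starting in state init."""
    path = [init]
    for _ in range(k):
        targets, weights = zip(*successors(path[-1]))
        path.append(rng.choices(targets, weights=weights)[0])
    return path


def approximate(successors, init, holds, k, eps, delta, rng):
    """Estimate Prob_k[formula] as the fraction A/N of satisfying paths."""
    n = sample_count(eps, delta)
    a = 0
    for _ in range(n):
        if holds(random_path(successors, init, k, rng)):
            a += 1
    return a / n


# Toy DTMC (an assumption for illustration): from "start", reach the
# absorbing state "goal" with probability 1/2 at every step.
def toy_successors(s):
    if s == "goal":
        return [("goal", 1.0)]
    return [("goal", 0.5), ("start", 0.5)]


def reaches_goal(path):
    return "goal" in path


estimate = approximate(toy_successors, "start", reaches_goal,
                       k=3, eps=0.05, delta=0.01, rng=random.Random(0))
```

For this toy chain the exact value is 1 - (1/2)^3 = 0.875, so the estimate
should lie within 0.05 of it with probability at least 0.99. Only one path is
held in memory at any time.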



Theorem 1. The generic approximation algorithm GAA is a fully polynomial
randomized approximation scheme for the probability $p = \mathrm{Prob}_k[\psi]$ for an LTL
formula $\psi$ and $p \in\, ]0, 1[$.

Proof. The random variable A is the sum of independent random variables with
a Bernoulli distribution. We use the Chernoff-Hoeffding bound [11] to obtain
the result. Let $X_1, \ldots, X_N$ be N independent random variables which take value
1 with probability p and 0 with probability (1 - p), and $Y = \sum_{i=1}^{N} X_i / N$. Then
the Chernoff-Hoeffding bound gives $\mathrm{Prob}[|Y - p| > \varepsilon] < 2e^{-N\varepsilon^2/4}$. In our case, if
$N \ge 4\log(2/\delta)/\varepsilon^2$, then $\mathrm{Prob}[|A/N - p| \le \varepsilon] \ge 1 - \delta$ where $p = \mathrm{Prob}_k[\psi]$.

The time needed to verify if a given path verifies $\psi$ is polynomial in the
size of the formula. The number N of iterations is polynomial in $1/\varepsilon$ and $\log(1/\delta)$. So
GAA is a fully polynomial randomized approximation scheme for our probability
problem.
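The step from the tail bound to the sample size can be spelled out as follows.
The derivation assumes that the exponent of the bound is $N\varepsilon^2/4$ and
that the logarithm is the natural logarithm; both are readings of the proof
above rather than additional results.

```latex
\begin{align*}
2e^{-N\varepsilon^2/4} \le \delta
  &\iff e^{-N\varepsilon^2/4} \le \delta/2 \\
  &\iff N\varepsilon^2/4 \ge \ln(2/\delta) \\
  &\iff N \ge 4\ln(2/\delta)/\varepsilon^2 .
\end{align*}
```

Hence the number of sampled paths grows quadratically in $1/\varepsilon$ but
only logarithmically in $1/\delta$.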

This algorithm provides a method to verify quantitative properties expressed
by EPF formulas. To check the property $\mathrm{Prob}_k[\psi] \ge b$, we can test if the result
of the approximation algorithm is greater than $b - \varepsilon$. If $\mathrm{Prob}_k[\psi] \ge b$ is true, then
the monotonicity of the property guarantees that $\mathrm{Prob}[\psi] \ge b$ is true. Otherwise,
we increment the value of k within a certain bound to conclude that $\mathrm{Prob}[\psi] \not\ge b$.

The main problem of the method is to determine a bound on the value of 
k. Unfortunately, this bound might be exponential in the number of states,
even for a simple reachability property. The bound is strongly related to the 
cover time of the underlying Markov chain. The problem of the computation of 
the cover time is known to be difficult when the input is given as a succinct 
representation [17]. 

Another way to deal with the value of k is to restrict our attention to formulas
with bounded Until rather than classical Until. With this hypothesis, we can set
k to the maximum time bound in some subformulas of the specification. But 
this is not completely satisfactory since we cannot handle general properties 
with only bounded until. 

Now, let us discuss the parameters $\varepsilon$ and $\delta$. The complexity of the algorithm
depends on $\log(1/\delta)$, which allows us to set $\delta$ to very small values. In our
experiments, we set $\delta = 10^{-10}$, which seems to be a reasonable confidence ratio.

The dependence on $\varepsilon$ is much more crucial, since the complexity is quadratic in
$1/\varepsilon$. We set $\varepsilon = 10^{-2}$ in our experiments because this is the value that allows
the best tradeoff between verification quality and time.
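To make the different roles of the two parameters concrete, the sketch below
evaluates the sample size N = 4 ln(2/δ)/ε² for a few settings. The function
name `sample_count`, the use of the natural logarithm, and the example values
ε = 10^-2 and δ = 10^-10 are assumptions made for illustration.

```python
import math


def sample_count(eps, delta):
    """Number of paths N = ceil(4 ln(2/delta) / eps^2)."""
    return math.ceil(4 * math.log(2 / delta) / eps ** 2)


n_base = sample_count(1e-2, 1e-10)        # example setting
n_half_eps = sample_count(5e-3, 1e-10)    # halve the approximation parameter
n_sq_delta = sample_count(1e-2, 1e-20)    # square the confidence parameter
```

Halving ε multiplies the number of paths by about four, whereas going from
δ = 10^-10 to δ = 10^-20 does not even double it.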

6 APMC: An Implementation 

In this section, we present some experimental results of our approximate model 
checking method. These results were obtained with a tool we developed. This 
tool, APMC, works in a distributed framework and allows the verification of ex- 
tremely large systems such as the 300 dining philosophers problem. We compare 
the performance of our method to the performance of PRISM. These results are 
promising, showing that large systems can be approximately verified in seconds, 
using very little memory. 

APMC (Approximate Probabilistic Model Checker) is a GPL (GNU General Public
License) tool written in C with lex and yacc. It uses a client/server computation
model (described in Subsection 6.2) to distribute path generation and verification
on a cluster of machines.

APMC is simple to use: the user enters an LTL formula and a description of
a system written in the same variant of Reactive Modules as used by PRISM.
The user enters the target satisfaction probability for the property, the length
of the paths to consider and the approximation and confidence parameters $\varepsilon$
and $\delta$. These parameters can be changed through a Graphical User Interface
(GUI), represented in Figure 1. These are the basic parameters; there are also
advanced parameters, such as the choice of a specific strategy for the
speed/space compromise, but one can use a “by default” mode which is sufficiently
efficient in general. After this, the user clicks on “go” and waits for the result.
APMC is a fully automatic verification tool.




Fig. 1. The Graphical User Interface. 



6.1 Standalone Use and Comparison with PRISM 

We first consider a classical problem from the PRISM examples library [23]: the 
dining philosophers problem. Let us quickly recall the problem: n philosophers 
are sitting around a table, each philosopher spends most of its time thinking, 
but sometimes gets hungry and wants to eat. To eat, a philosopher needs both 
its right and left forks, but there are only n forks shared by all philosophers. 
The problem is to find a protocol for the philosophers without livelock. Pnueli 
and Zuck [21] give a protocol that is randomized. We ran experiments on a fully 
probabilistic version of this protocol (that is, a DTMC version): there are no
nondeterministic transitions and the scheduling between philosophers is randomized.
For this protocol, we checked the following liveness property: “If a philosopher
is hungry, then with probability one, some philosopher will eventually eat”. This
property guarantees that the protocol is livelock free. The following table shows
our results using APMC and those of PRISM (model construction and model
checking time) on one 1.8 GHz Pentium 4 workstation with 512 MB of memory
under the Linux operating system. For this experiment, we let $\varepsilon = 10^{-2}$ and
$\delta = 10^{-10}$.



number of phil.   length   APMC (time in sec.)   PRISM (time in sec.)   PRISM (states)
3                 20       35                    0.394                  770
5                 23       56                    0.87                   64858
10                30       125                   11.774                 4.21 × 10^9
15                42       242                   64.158                 2.73 × 10^14
20                50       387                   137.185                1.77 × 10^19
25                55       531                   2469.56                1.14 × 10^24
30                65       823                   out of mem.            out of mem.
50                130      3579                  out of mem.            out of mem.
100               148      8364                  out of mem.            out of mem.


On this example, we see that we can handle larger systems than PRISM, 
more than 30 philosophers for Pnueli and Zuck’s philosophers, without having 
to construct the entire model which contains $10^{24}$ states for 25 philosophers.
Note that during the computation, our tool uses very little memory. This is due 
to the fact that the verification process never stores more than one path at a 
time. 



6.2 Cluster Use 

In the previous subsection, we used APMC on a single machine, but to increase 
the efficiency of the verification, APMC can distribute the computation on a 
cluster of machines using a client/server architecture. 

Let us briefly describe the client/server architecture of APMC. The model, 
formula and other parameters are entered by the user via the Graphical User In- 
terface which runs on the server (master). Both the model and formula are trans- 
lated into C source code, compiled and sent to clients (the workers) when they 
request a job. Workers regularly send their current verification results and
receive an acknowledgment from the master that tells them whether to continue or
stop the computation. Since the workers only need memory to store the generated
code and one path, the verification requires very little memory. Furthermore, 
since each path is verified independently, there is no problem of load balancing. 
Figure 4 shows the scalability of the implementation on Pnueli and Zuck’s din- 
ing philosophers algorithm for 25 philosophers: computation time is divided by 
two when we double the size of the cluster. This is a consequence of very low 
communications overhead in the computation. 

We used APMC to check properties of several fully probabilistic systems 
modeled as DTMCs. In Figure 2, we consider Pnueli and Zuck’s Dining Philoso- 
phers algorithm [21] for which we verify the liveness property and in Figure 3, 
we consider a fully probabilistic version of the randomized mutual exclusion of 
Pnueli and Zuck [21]. All the experiments were done with a cluster of 20 workers
(all are ATHLON XP1800+ under Linux) with $\varepsilon = 10^{-2}$ and $\delta = 10^{-10}$.

phil.   length   time (sec.)   max. memory (KBytes)
15      38       11            324
25      55       25            340
50      130      104           388
100     145      418           484
200     230      1399          676
300     295      4071          1012


Fig. 2. Dining philosophers: run-time and memory for 20 workers. 



proc.   length   time (sec.)   max. memory (KBytes)
3       120      13            316
5       250      35            328
10      520      146           408
15      1000     882           548
20      1400     1499          660


Fig. 3. Mutual exclusion: run-time and memory for 20 workers. 



We are able to verify very large systems using a reasonable cluster of workers 
and very little memory for each of them. In an additional experiment, with a
heterogeneous cluster of 32 machines, we were able to verify Pnueli and
Zuck’s protocol for 500 philosophers in about four hours.




Fig. 4. Scalability of the implementation: time vs. workers for 25 dining philosophers.

7 Conclusion

To our knowledge, this work is the first to apply randomized approximation 
schemes to probabilistic model checking. We estimate the probability with a 
randomized algorithm and conclude that satisfaction probabilities of EPF for- 
mulas can be approximated. This fragment is sufficient to express reachability 
and livelock-freeness properties. Our implementation was used to investigate the 
effectiveness of this method. Our experiments point to an essential advantage of 
the method: the use of very little memory. In practice, this means that we are 
able to verify very large fully probabilistic models, such as the dining
philosophers problem with 500 philosophers. This method seems to be very useful
when classical verification is intractable.

Abstract. For every finite model M and an LTL property $\varphi$, there exists
a number CT (the Completeness Threshold) such that if there is no
counterexample to $\varphi$ in M of length CT or less, then $M \models \varphi$. Finding
this number, if it is sufficiently small, offers a practical method for
making Bounded Model Checking complete. We describe how to compute
an over-approximation to CT for a general LTL property using Büchi
automata, following the Vardi-Wolper LTL model checking framework.
Based on the value of CT, we prove that the complexity of standard
SAT-based BMC is doubly exponential, and that consequently there is
a complexity gap of an exponent between this procedure and standard
LTL model checking. We discuss ways to bridge this gap.

The article mainly focuses on observations regarding bounded model 
checking rather than on a presentation of new techniques. 



1 Introduction 

Bounded Model Checking (BMC) [2,3] is a method for finding logical errors, or 
proving their absence, in finite-state transition systems. It is widely regarded as 
a complementary technique to symbolic BDD-based model checking (see [3] for 
a survey of experiments with BMC conducted in industry). Given a finite
transition system M, an LTL formula $\varphi$ and a natural number k, a BMC procedure
decides whether there exists a computation in M of length k or less that violates
$\varphi$. SAT-based BMC is performed by generating a propositional formula, which
is satisfiable if and only if such a path exists. BMC is conducted in an iterative
process, where k is incremented until either (i) an error is found, (ii) the
problem becomes intractable due to the complexity of solving the corresponding SAT
instance, or (iii) k reaches some pre-computed threshold, which indicates that
M satisfies $\varphi$. We call this threshold the Completeness Threshold, and denote it
by CT. CT is any natural number that satisfies

$$(M \models_{CT} \varphi) \rightarrow (M \models \varphi)$$

where $M \models_{CT} \varphi$ denotes that no computation of M of length CT or less violates
$\varphi$. Clearly, if $M \models \varphi$ then the smallest CT is equal to 0, and otherwise it is
equal to the length of the shortest counterexample. This implies that finding the
smallest CT is at least as hard as checking whether $M \models \varphi$. Consequently, we
concentrate on computing an over-approximation to the smallest CT based on
graph-theoretic properties of M (such as the diameter of the graph representing
it) and an automaton representation of $\neg\varphi$. In particular, we consider all
models with the same graph-theoretic properties of M as one abstract model.
Thus, this computation corresponds to finding the length of the longest shortest
counterexample to $\varphi$ by any one of these graphs, assuming at least one of them
violates $\varphi$^1. Thus, when we say the value of CT in the rest of the paper, we refer
to the value computed by this abstraction.

The value of CT depends on the model M, the property $\varphi$ (both the structure
of $\varphi$ and the propositional atoms it refers to), and the exact scheme used for
translating the model and property into a propositional formula. The original
translation scheme of [2], which we will soon describe, is based on a k-step
syntactic expansion of the formula, using Pnueli’s expansion rules for LTL [8,
12] (e.g., $Fp \equiv p \lor XFp$). With this translation, the value of CT was until now
known only for unnested properties such as Gp formulas [2] and Fp formulas
[11]. Computing CT for general LTL formulas has so far been an open problem.

In order to solve this problem we suggest using instead a semantic translation
scheme, based on Büchi automata, as was suggested in [6]^2. The translation is
straightforward because it follows very naturally the Vardi-Wolper LTL model
checking algorithm, i.e., checking for emptiness of the product of the model
M and the Büchi automaton $B_{\neg\varphi}$ representing the negation of the property $\varphi$.
Non-emptiness of $M \times B_{\neg\varphi}$, i.e., the existence of a counterexample, is proven by
exhibiting a path from an initial state to a fair loop. We will describe this
algorithm in more detail in Section 3. Deriving from this product a propositional
formula $\Omega_\varphi(k)$ that is satisfiable iff there exists such a path of length k or less
is easy: one simply needs to conjoin the k-unwinding of the product automaton
with a condition for detecting a fair loop. We will give more details about this
alternative BMC translation in Section 4. For now let us just mention that due
to the fact that $\Omega_\varphi(k)$ has the same structure regardless of the property $\varphi$, it
is easy to compute CT based on simple graph-theoretic properties of $M \times B_{\neg\varphi}$.
Furthermore, the semantic translation leads to smaller CNF formulas compared
to the syntactic translation. There are two reasons for this:

1. The semantic translation benefits from the existing algorithms for
constructing compact representations of LTL formulas as Büchi automata. Such
optimizations are hard to achieve with the syntactic translation. For example,
the syntactic translation for FFp results in a larger propositional formula
compared to the formula generated for Fp, although these are two equivalent
formulas. Existing algorithms [14] generate in this case a Büchi automaton
that corresponds to the second formula in both cases.

2. The number of variables in the formula resulting from the semantic
translation is linear in k, compared to quadratic in k for the syntactic
translation.

^1 If this assumption does not hold, e.g., when $\varphi$ is a tautology, the smallest
threshold is of course 0.

^2 The authors of [6] suggested this translation in the context of Bounded Model
Checking of infinite systems, without examining the implications of this
translation on completeness and complexity as we do here.

This paper is mainly an exposition of observations about bounded model
checking^3 rather than a presentation of new techniques. In particular, we show how
to compute CT based on the semantic translation; prove the advantages of this 
translation with respect to the size of the resulting formula as mentioned above, 
both theoretically and through experiments; and, finally, we discuss the question 
of the complexity of BMC. In Section 5 we show that due to the fact that CT 
can be exponential in the number of state variables, solving the corresponding 
SAT instance is a doubly exponential procedure in the number of state variables 
and the size of the formula. This implies that there is a complexity gap of an 
exponent between the standard BMC technique and LTL model checking. We 
suggest a SAT-based procedure that closes this gap while sacrificing some of 
the main advantages of SAT. So far our experiments show that our procedure 
is not better in practice than the standard SAT-based BMC, although future 
improvements of this technique may change this conclusion. 



2 A Translation Scheme and Its Completeness Threshold 

2.1 Preliminaries 

A Kripke structure M is a quadruple M = (S, I, T, L) such that (i) S is the set
of states, where states are defined by valuations to a set of Boolean variables
(atomic propositions) At, (ii) $I \subseteq S$ is the set of initial states, (iii) $T \subseteq S \times S$ is
the transition relation, and (iv) $L: S \to 2^{At}$ is the labeling function. Labeling
is a way to attach observations to the system: for a state $s \in S$ the set L(s)
contains exactly those atomic propositions that hold in s. We write p(s) to denote
$p \in L(s)$. The set of initial states I and the transition relation T are given as
functions in terms of At. This kind of representation, frequently called
functional form, can be exponentially more succinct compared to an explicit
representation of the states. This fact is important for establishing the
complexity of the semantic translation, as we do in Section 4.

Propositional Linear Temporal Logic (LTL) formulas are defined recursively:
Boolean variables are in LTL; then, if $\varphi_1, \varphi_2 \in$ LTL then so are $F\varphi_1$ (Future),
$G\varphi_1$ (Globally), $X\varphi_1$ (neXt), $\varphi_1 U \varphi_2$ ($\varphi_1$ Until $\varphi_2$) and $\varphi_1 W \varphi_2$ ($\varphi_1$ Waiting-for
$\varphi_2$), $\varphi_1 \lor \varphi_2$ and $\neg\varphi_1$.

^3 Some of these observations can be considered as folk theorems in the BMC
community, although none of them to the best of our knowledge was previously
published.

2.2 Bounded Model Checking of LTL Properties

Given an LTL property $\varphi$, a Kripke structure M and a bound k, Bounded
Model Checking is performed by generating and solving a propositional formula
$\Omega_\varphi(k) := [[ M ]]_k \land [[ \neg\varphi ]]_k$, where $[[ M ]]_k$ represents the reachable states up to
step k and $[[ \neg\varphi ]]_k$ specifies which paths of length k violate $\varphi$. The satisfiability
of this conjunction implies the existence of a counterexample to $\varphi$. For example,
for simple invariant properties of the form Gp the BMC formula is

$$\Omega_\varphi(k) := I(s_0) \land \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1}) \land \bigvee_{i=0}^{k} \neg p(s_i)$$

where the left two conjuncts represent $[[ M ]]_k$, and the right conjunct represents
$[[ \neg\varphi ]]_k$.

There are several known methods for generating $[[ \neg\varphi ]]_k$ [2,3,7]. These
translations are based on the LTL expansion rules [8,12] (e.g., $Fp \equiv p \lor XFp$ and
$Gp \equiv p \land XGp$).
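The meaning of the BMC formula for an invariant Gp can be illustrated on an
explicit graph. The sketch below is not a SAT encoding: it enumerates paths
directly and returns a path $s_0, \ldots, s_j$ with $j \le k$ that starts in an
initial state, follows the transition relation, and ends in a state violating
p. The function name `bmc_invariant` and the dictionary representation of the
transition relation are assumptions made for illustration.

```python
def bmc_invariant(init, trans, p, k):
    """Return a counterexample to G p with at most k transitions, or None.

    init  : iterable of initial states
    trans : dict mapping a state to the list of its successors
    p     : predicate on states (the invariant)
    """
    frontier = [[s] for s in init]          # all paths with 0 transitions
    for _ in range(k + 1):
        next_frontier = []
        for path in frontier:
            if not p(path[-1]):             # the disjunct "not p(s_i)" holds
                return path
            for t in trans.get(path[-1], ()):
                next_frontier.append(path + [t])
        frontier = next_frontier
    return None


# Toy model: a counter modulo 4, with the invariant "the counter is not 3".
toy_trans = {0: [1], 1: [2], 2: [3], 3: [0]}
not_three = lambda s: s != 3

cex_k2 = bmc_invariant([0], toy_trans, not_three, k=2)
cex_k3 = bmc_invariant([0], toy_trans, not_three, k=3)
```

With k = 2 no violation is visible, while k = 3 exposes the counterexample
0, 1, 2, 3. A SAT-based tool would obtain the same answer by solving the
formula for each k instead of enumerating paths.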

In the rest of this section we consider the translation scheme of Biere et al. 
[3] given below. This translation distinguishes between finite and infinite paths 
(for the latter it formulates a path ending with a loop) . For a given property, it 
generates both translations, and concatenates them with a disjunction. 

Constructing a propositional formula that captures finite paths is simple. The
formula is expanded k times according to the LTL expansion rules mentioned
above, where each subformula, at each location, is represented by a new variable.
For example, for the operator F, the expansion for $i \le k$ is $[[ F\varphi ]]_k^i := [[ \varphi ]]_k^i \lor
[[ F\varphi ]]_k^{i+1}$ and for $i > k$, $[[ F\varphi ]]_k^i := \mathit{false}$. Similar rules exist for the other
temporal operators and for the propositional connectives.
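The expansion rule for F on a finite path can be read as a recursive
evaluator. The following sketch mirrors the rule on a concrete path
$s_0, \ldots, s_k$ rather than on propositional variables; the function name
`holds_F` and the encoding of the subformula as a Python predicate are
assumptions made for illustration.

```python
def holds_F(phi, path, i=0):
    """Evaluate F phi at position i of the finite path s_0 .. s_k.

    Mirrors the expansion: phi at position i, or F phi at position i + 1,
    and false once i exceeds k.
    """
    k = len(path) - 1
    if i > k:
        return False
    return phi(path[i]) or holds_F(phi, path, i + 1)


is_three = lambda s: s == 3
is_zero = lambda s: s == 0
```

For example, `holds_F(is_three, [0, 1, 2, 3])` is true, while
`holds_F(is_zero, [0, 1, 2], i=1)` is false because position 0 is no longer
considered.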

To capture paths ending with a loop (representing infinite paths) we need to
consider the state $s_l$ ($l \le k$), which the last state transitions to. The translation
for the operator F for such paths is: ${}_l[[ F\varphi ]]_k^i := {}_l[[ \varphi ]]_k^i \lor {}_l[[ F\varphi ]]_k^{succ(i)}$, where
$succ(i) = i + 1$ if $i < k$, and $succ(i) = l$ otherwise.

Finally, in order to capture all possible loops we generate
$\bigvee_{l=0}^{k} ({}_lL_k \land {}_l[[ \varphi ]]_k^0)$, where ${}_lL_k := (s_l = s_k)$, i.e., an expression that is true iff
there exists a back loop from state $s_k$ to state $s_l$. Each expression of the form
${}_l[[ \varphi ]]_k^i$ or $[[ \varphi ]]_k^i$ is represented by a new variable. The total number of
variables introduced by this translation is quadratic in k. More accurately:

Proposition 1. The syntactic translation results in $O(k \cdot |v| + (k+1)^2 \cdot |\varphi|)$
variables, where v is the set of variables defining the states of M, i.e., $|v| =
O(\log|S|)$.

Proof. Recall the structure of the formula $\Omega_\varphi(k) := [[ M ]]_k \land [[ \neg\varphi ]]_k$. The
sub-formula $[[ M ]]_k$ adds $O(k \cdot |v|)$ variables. The sub-formula $[[ \neg\varphi ]]_k$ adds,
according to the recursive translation scheme, not more than $(k+1)^2 \cdot |\varphi|$ variables,
because each expression of the form ${}_l[[ \varphi ]]_k^i$ is a new variable, and both indices i
and l range over 0...k. Further, each subformula is unfolded separately, hence
leading to the result stated above.

2.3 A Completeness Threshold for Simple Properties

There are two known results regarding the value of CT, one for Gp and one for 
Fp formulas. Their exposition requires the following definitions. 

Definition 1. The diameter of a finite transition system M, denoted by d(M),
is the longest shortest path (defined by the number of its edges) between any two 
reachable states of M. 

The diameter problem can be reduced to the ‘all pair shortest path’ problem, 
and therefore be solved in time polynomial in the size of the graph. In our case, 
however, the graph itself is exponential in the number of variables. Alternatively, 
one may use the formulation of this problem as satisfiability of a Quantified 
Boolean Formula (QBF), as suggested in [2], and later optimized in [1,13]. 
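For an explicitly given graph, the reduction to shortest paths can be sketched
with one breadth-first search per state. The sketch assumes a small explicit
transition relation stored as a dictionary, which is exactly the
representation that becomes infeasible for symbolic models; the function names
and the toy graph are illustrative assumptions. Pairs of states with no
connecting path are ignored.

```python
from collections import deque


def bfs_distances(trans, src):
    """Shortest distance (number of edges) from src to every reachable state."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        s = queue.popleft()
        for t in trans.get(s, ()):
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return dist


def diameter(trans, init):
    """Longest shortest path between any two states reachable from init."""
    reachable = set()
    for s in init:
        reachable.update(bfs_distances(trans, s))
    return max(max(bfs_distances(trans, s).values()) for s in reachable)


def diameter_from(trans, starts):
    """Same measure, but only for paths that begin in one of the start states."""
    return max(max(bfs_distances(trans, s).values()) for s in starts)


# Toy graph: state 0 enters the cycle 1 -> 2 -> 3 -> 4 -> 1 at two points.
toy = {0: [1, 3], 1: [2], 2: [3], 3: [4], 4: [1]}
d_all = diameter(toy, [0])
d_init = diameter_from(toy, [0])
```

On this graph every state is within two steps of state 0, but three steps are
needed between some pairs of states on the cycle, so restricting the start
state gives a smaller bound.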

Definition 2. The recurrence diameter of a finite transition system M , denoted 
by rd(M), is the longest loop-free path in M between any two reachable states.

Finding the longest loop-free path between two states is NP-complete in the size 
of the graph. One way to solve it with SAT was suggested in [2]. The number of 
variables required by their method is proportional to the length of the longest 
loop-free path. Hence, the SAT instance may have an exponential number of 
variables, and finding a solution to this instance is doubly exponential. 
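On a small explicit graph the longest loop-free path can be found by
exhaustive depth-first search, which makes the contrast with shortest paths
visible. The sketch below is exponential in the worst case, in line with the
NP-completeness mentioned above, and is not the SAT-based method of [2]; the
function names and the toy graph are illustrative assumptions.

```python
def longest_loop_free(trans, s, seen=frozenset()):
    """Length (in edges) of the longest path from s that repeats no state."""
    seen = seen | {s}
    best = 0
    for t in trans.get(s, ()):
        if t not in seen:
            best = max(best, 1 + longest_loop_free(trans, t, seen))
    return best


def recurrence_diameter(trans, starts):
    """Longest loop-free path beginning in one of the given start states."""
    return max(longest_loop_free(trans, s) for s in starts)


# Toy graph: state 0 enters the cycle 1 -> 2 -> 3 -> 4 -> 1 at two points.
toy = {0: [1, 3], 1: [2], 2: [3], 3: [4], 4: [1]}
rd_from_0 = recurrence_diameter(toy, [0])
rd_from_1 = recurrence_diameter(toy, [1])
```

Starting from state 0 the longest loop-free path has four edges (for example
0, 1, 2, 3, 4), although every state is reachable from 0 in at most two
steps.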

We denote by $d^I(M)$ and $rd^I(M)$ the initialized diameter and recurrence
diameter, respectively, i.e., the length of the corresponding paths when they are
required to start from an initial state.

For formulas of the form Fp (i.e., counterexamples to $G\neg p$ formulas), Biere et
al. suggested in [2] that CT is less than or equal to d(M) (it was later observed by
several researchers independently that in fact $d^I(M)$ is sufficient). For formulas
of the form Gp (counterexamples to $F\neg p$ formulas), it was shown in
[11] that CT is equal to $rd^I(M)$. Computing CT for general LTL formulas, as
was mentioned in the introduction, has so far been an open problem.

In the next section we review how LTL model checking can be done with
Büchi automata. In Section 4 we will show how a similar method can be used
for generating $\Omega_\varphi(k)$.

3 LTL Model Checking with Büchi Automata

In this section we describe how model checking of LTL formulas can be done
with Büchi automata, as it was first introduced by Vardi and Wolper in [15]. A
labeled Büchi automaton $M = (S, S_0, \delta, L, F)$ is a 5-tuple where S is the set of
states, $S_0 \subseteq S$ is a set of initial states, $\delta \subseteq (S \times S)$ is the transition relation,
L is a labeling function mapping each state to a Boolean combination of the
atomic propositions, and $F \subseteq S$ is the set of accepting states. The structure
of M is similar to that of a finite-state automaton, but M is used for deciding
acceptance of infinite words. Given an infinite word w, $w \in \mathcal{L}(M)$ if and only if
the execution of w on M passes an infinite number of times through at least one
of the states in F. In other words, if we denote by inf(w) the set of states that
appear infinitely often in the path of w on M, then $inf(w) \cap F \neq \emptyset$.

Every LTL formula $\varphi$ can be translated into a Büchi automaton $B_\varphi$ such
that $B_\varphi$ accepts exactly the words (paths) that satisfy $\varphi$. There are several
known techniques to translate $\varphi$ to $B_\varphi$^4. We do not repeat the details of this
construction; rather we present several examples in Fig. 1 of such translations.







Fig. 1. Several LTL formulas (FGf, GFf, Ff, Gf) and their corresponding Büchi
automata. Accepting states are marked by double circles.



LTL model-checking can be done as follows: Given an LTL formula $\varphi$,
construct $B_{\neg\varphi}$, a Büchi automaton that accepts exactly those paths that violate $\varphi$.
Then, check whether $\Psi = M \times B_{\neg\varphi}$ is empty. It is straightforward to see that
$M \models \varphi$ if and only if $\Psi$ is empty. Thus, LTL model checking is reduced to the
question of Büchi automaton emptiness, i.e., proving that no word is accepted
by the product automaton $\Psi$. In order to prove emptiness, one has to show that
no computation of $\Psi$ passes through an accepting state an infinite number of
times. Consequently, finding a reachable loop in $\Psi$ that contains an accepting
state is necessary and sufficient for proving that the relation $M \not\models \varphi$ holds. Such
loops are called fair loops.
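On an explicit product graph, the search for a fair loop amounts to finding a
reachable accepting state that can reach itself again. The sketch below uses
two plain reachability passes per candidate state, which is simple rather than
efficient; practical model checkers use nested depth-first search or SCC
decomposition instead. The function names and the toy graph are assumptions
made for illustration.

```python
def reach(trans, sources):
    """States reachable from the given sources in zero or more steps."""
    seen = set(sources)
    stack = list(seen)
    while stack:
        s = stack.pop()
        for t in trans.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def has_fair_loop(trans, init, accepting):
    """True iff some reachable accepting state lies on a cycle."""
    for f in reach(trans, init) & set(accepting):
        # f is on a cycle iff it is reachable from its own successors.
        if f in reach(trans, trans.get(f, ())):
            return True
    return False


# Toy product: a -> b -> c -> b, plus an unreachable self-loop on d.
toy = {"a": ["b"], "b": ["c"], "c": ["b"], "d": ["d"]}
```

Here `has_fair_loop(toy, ["a"], {"c"})` holds, whereas choosing `{"a"}` (not on
a cycle) or `{"d"}` (not reachable) as the accepting set yields no fair loop,
so the product would be empty.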

^4 Most published techniques for this translation construct a generalized Büchi
automaton, while in this article we use a standard Büchi automaton (the only
difference being that the former allows multiple accepting sets). The translation
from generalized to standard Büchi automaton multiplies the size of the
automaton by up to a factor of $|\varphi|$.

4 The Semantic Translation

The fact that non-emptiness of $\Psi$ is proven by finding a path to a fair loop gives
us a straightforward adaptation of the LTL model checking procedure to a SAT-based
BMC procedure. This can be done by searching for a witness to the property
$\varphi' = G(\mathit{true})$ under the fairness constraint $\bigvee_{F_i \in F} F_i$ [5] (that is, $\bigvee_{F_i \in F} F_i$
should be true infinitely often in this path). Thus, given $\Psi$ and k, we can use the
standard BMC translation for deriving $\Omega_\Psi(k)$, a SAT instance that represents
all the paths of length k that satisfy $\varphi'$. Finding such a witness of length k or
less is done in BMC by solving the propositional formula:

$$\Omega_\Psi(k) = I(s_0) \land \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1}) \land \bigvee_{l=0}^{k-1} \left( (s_l = s_k) \land \bigvee_{j=l}^{k} \bigvee_{F_i \in F} F_i(s_j) \right) \qquad (1)$$


The right-most conjunct in Equation 1 constrains one of the states in F to be 
true in at least one of the states of the loop. 
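The three conjuncts of Equation 1 can be checked directly on an explicit
graph. The sketch below enumerates paths $s_0, \ldots, s_k$ instead of calling
a SAT solver: the path must start in an initial state, follow the transition
relation, close a loop ($s_l = s_k$ for some $l < k$), and visit an accepting
state somewhere on that loop. The function name `bounded_fair_lasso` and the
toy graph are assumptions made for illustration.

```python
def bounded_fair_lasso(trans, init, accepting, k):
    """Return (path, l) for a fair lasso with exactly k transitions, or None."""
    def extend(path):
        if len(path) == k + 1:
            for l in range(k):
                loop_closed = path[l] == path[k]
                fair = any(s in accepting for s in path[l:])
                if loop_closed and fair:
                    return path, l
            return None
        for t in trans.get(path[-1], ()):
            found = extend(path + [t])
            if found:
                return found
        return None

    for s in init:
        found = extend([s])
        if found:
            return found
    return None


# Toy product: a -> b -> c -> b with accepting state c.
toy = {"a": ["b"], "b": ["c"], "c": ["b"]}
witness_k2 = bounded_fair_lasso(toy, ["a"], {"c"}, k=2)
witness_k3 = bounded_fair_lasso(toy, ["a"], {"c"}, k=3)
```

With k = 2 the path a, b, c has not yet closed a loop, so there is no witness;
with k = 3 the path a, b, c, b closes the loop at l = 1 and passes through the
accepting state c.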

Since the Büchi automaton used in this translation captures the semantics of
the property rather than its syntactic structure, we call this method a semantic 
translation for BMC. 

We continue by proving the two advantages of this translation: the efficiency 
of the translation and the ease of computing CT. 



4.1 The Semantic Translation Is More Efficient 

The semantic translation has a clear advantage in terms of the size of the re- 
sulting formula (in terms of the number of variables), as stated in the following 
proposition (compare to Proposition 1). 

Proposition 2. The semantic translation results in $O(k \cdot (|v| + |\varphi|))$ variables.

Proof. The transition relation of the Büchi automaton constructed from $\varphi$ is
defined by $O(|\varphi|)$ variables (the automaton itself is exponential in the size of
the formula, but its corresponding representation by a transition relation is as
defined above). The SAT formula is constructed by unfolding k times the
product $\Psi$, hence it uses $O(k \cdot (|v| + |\varphi|))$ variables. It also includes constraints for
identifying a loop with a fair state, but these constraints only add clauses, not
new variables. □

We conducted some experiments in order to check the difference between the
translations, using NuSMV 2.1, which includes
an optimized syntactic translation [4]. To generate the semantic translation, we
derived the Büchi automaton with Wring [14] and added the resulting
automaton to the NuSMV model. Then the property to be checked is simply F(false)
under possible fairness constraints, as prescribed by the Büchi automaton. The
table in Figure 2 summarizes these results.

As can be seen from the table and from Figure 3, there is a linear growth
in the number of variables in the resulting CNF formula with the semantic
translation, and a quadratic growth with the syntactic translation. Furthermore,
the last three formulas have redundancy that is removed by Wring, but is not
removed with the syntactic translation (observe that formulas 2 and 3 result in
the same number of variables in the semantic translation, but not in the syntactic
translation). The model which we experimented with was a toy example, and
hence the resulting CNF was easy to solve with both translations. But it was
sufficient for demonstrating the differences between the translations and that
in some cases even generating the CNF formulas takes a long time with the
syntactic translation.

Property                   k    Semantic    Syntactic
(x0 U (!x0 ∧ x1)) U x2     7        1090          986
                          15        2298         3098
                          30        4563        14073
                          45        6828        39898
FFFx2                      7         528          569
                          15        1112         1321
                          30        2207         3076
                          45        3302         5281
FFFFFFx2                   7         528          632
                          15        1112      Timeout
                          30        2207
                          45        3302
GFGFx2                     7         586          590
                          15        1234         1426
                          30        2449         3511
                          45        3664      Timeout

Fig. 2. The number of variables in the CNF formula resulting from the semantic and
syntactic translation. The former grows linearly with k, while the latter may grow
quadratically. The Timeout entry indicates that it takes NuSMV more than 15 minutes
to generate the CNF formula.

Fig. 3. The number of variables for the formula (x0 U (!x0 ∧ x1)) U x2 with the semantic
and syntactic translations.
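
The growth rates can be checked directly on the numbers reported in Fig. 2. The
following sketch, with variable names chosen only for this example, computes the
average number of additional CNF variables per unit increase of k for the first
property in the table.

```python
# Variable counts copied from the first property in Fig. 2.
KS = [7, 15, 30, 45]
SEMANTIC = [1090, 2298, 4563, 6828]
SYNTACTIC = [986, 3098, 14073, 39898]


def per_step_increments(ks, counts):
    """Average number of new CNF variables per unit increase of k,
    computed between consecutive rows of the table."""
    return [
        (counts[i + 1] - counts[i]) / (ks[i + 1] - ks[i])
        for i in range(len(ks) - 1)
    ]


SEM_INC = per_step_increments(KS, SEMANTIC)
SYN_INC = per_step_increments(KS, SYNTACTIC)
```

For the semantic translation the increment is constant (151 variables per step), which
is what linear growth means; for the syntactic translation the increment itself grows
with k, as expected from a quadratic dependence.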



Completeness and Complexity of Bounded Model Checking 






4.2 A Calculation of CT for LTL Based on M X 

A major benefit of the semantic translation is that it directly implies an
over-approximation of the value of CT:

Theorem 1. A completeness threshold for any LTL property $\varphi$ when using
Equation 1 is $\min(rd^I(\Psi) + 1,\; d^I(\Psi) + d(\Psi))$.

Proof. (a) We first prove that CT is bounded by $d^I(\Psi) + d(\Psi)$. If $M \not\models \varphi$ then
$\Psi$ is not empty. The shortest witness for the non-emptiness of $\Psi$ is a path
$s_0, \ldots, s_f, \ldots, s_k$ where $s_0$ is an initial state, $s_f$ is an accepting state and $s_k = s_l$
for some $l \le f$. The shortest path from $s_0$ to $s_f$ is not longer than $d^I(\Psi)$, and
the shortest path from $s_f$ back to itself is not longer than $d(\Psi)$. (b) We now
prove that CT is also bounded by $rd^I(\Psi) + 1$ (the addition of 1 to the longest
loop-free path is needed in order to detect a loop). Falsely assume that $M \not\models \varphi$
but all witnesses are of length longer than $rd^I(\Psi) + 1$. Let $W : s_0, \ldots, s_f, \ldots, s_k$
be the shortest such witness. By definition of $rd^I(\Psi)$, there exist at least two
states, say $s_i$ and $s_j$, in this path that are equal (other than the states closing
the loop, i.e., $s_i \neq s_k$). If $i, j < f$ or $i, j > f$ then this path can be shortened
by taking the transition from $s_i$ to $s_{j+1}$ (assuming, without loss of generality,
that $i < j$), which contradicts our assumption that $W$ is the shortest witness. If
$i < f < j$, then the path $W' : s_0, \ldots, s_f, \ldots, s_j$ is also a loop witnessing $M \not\models \varphi$,
but shorter than $W$, which again contradicts our assumption. □

The left-hand side drawing below demonstrates a case in which $d^I(\Psi) + d(\Psi) >
rd^I(\Psi) + 1$ ($d^I(\Psi) = d(\Psi) = rd^I(\Psi) = 3$), while the right-hand side drawing
demonstrates the opposite case (in this case $d^I(\Psi) = d(\Psi) = 1$, $rd^I(\Psi) + 1 = 5$).
These examples justify taking the minimum between the two values.
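
The bound of Theorem 1 can be computed on small explicit graphs. The sketch below
assumes the usual readings of the three quantities: the initialized diameter as the
longest shortest path from an initial state, the diameter as the longest shortest path
between any two states, and the initialized recurrence diameter as the longest
loop-free path from an initial state. The two graphs are not the drawings referred to
above; they are hypothetical examples chosen to reproduce the same numbers.

```python
from collections import deque


def bfs_dist(succ, src):
    """Shortest distances (in edges) from src to every reachable state."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in succ.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def init_diameter(succ, init):
    return max(max(bfs_dist(succ, s).values()) for s in init)


def diameter(succ):
    nodes = set(succ) | {v for vs in succ.values() for v in vs}
    return max(max(bfs_dist(succ, s).values()) for s in nodes)


def init_recurrence_diameter(succ, init):
    def longest(u, seen):
        # Longest loop-free path starting at u, avoiding states in seen.
        best = 0
        for v in succ.get(u, ()):
            if v not in seen:
                best = max(best, 1 + longest(v, seen | {v}))
        return best

    return max(longest(s, {s}) for s in init)


def completeness_threshold(succ, init):
    """min(rd^I + 1, d^I + d), as in Theorem 1."""
    return min(init_recurrence_diameter(succ, init) + 1,
               init_diameter(succ, init) + diameter(succ))


CYCLE = {0: [1], 1: [2], 2: [3], 3: [0]}                           # d^I = d = rd^I = 3
CLIQUE = {u: [v for v in range(5) if v != u] for u in range(5)}    # d^I = d = 1, rd^I = 4
```

On the 4-cycle the recurrence-diameter bound wins (4 against 6); on the 5-clique the
diameter bound wins (2 against 5).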




An interesting special case is invariant properties (Gp). The Büchi automaton
for the negation of this property (F¬p) has a special structure (see third drawing
in Fig 1): for all M, any state satisfying ¬p in the product $\Psi$ of M and this
automaton leads to a fair loop. Thus, to prove emptiness, it is sufficient to search
for a reachable state satisfying ¬p. A path to such a state cannot be longer than
$d^I(\Psi)$. More formally:

Theorem 2. A completeness threshold for Fp formulas, where p is non-
temporal, is $d^I(\Psi)$.

We believe that this theorem can be extended to all safety properties.

5 The Complexity of BMC

According to Theorem 1, the value of CT can be exponential in the number
of state variables. This implies that the SAT instance (as generated in both the
syntactic and semantic translations) can have an exponential number of variables
and hence solving it can be doubly exponential. All known SAT-based BMC
techniques, including the one presented in this article, have this complexity. Since
there exists a singly exponential LTL model checking algorithm in the number of
state variables, it is clear that there is a complexity gap of an exponent between
the two methods. Why, then, use BMC for attempting to prove that $M \models \varphi$
holds? There are several answers to this question:

1. Indeed, BMC is normally used for detecting bugs, not for proving their
absence. The number of variables in the SAT formula is polynomial in k. If the
property does not hold, k depends on the location of the shallowest error. If
this number is relatively small, solving the corresponding SAT instance can
still be easier than competing methods.

2. In many cases the values of $rd^I(\Psi)$ and $d^I(\Psi)$ are not exponential in the
number of state variables, and can even be rather small. In hardware circuits,
the leading cause for exponentially long loop-free paths is counters, and hence
designs without counters are much easier to solve. For example, about 25%
of the components examined in [1] have a diameter smaller than 20.

3. For various technical reasons, SAT is not very sensitive to the number of vari- 
ables in the formula, although theoretically it is exponential in the number of 
variables. Comparing it to other methods solely based on their correspond- 
ing complexity classes is not a very good indicator for their relative success 
in practice. 

We argue that the reason for the complexity gap between SAT-based BMC and 
LTL model checking (as described in Section 3), is the following: SAT-based 
BMC does not keep track of visited states, and therefore it possibly visits the 
same state an exponential number of times. Unlike explicit model checking it 
does not explore a state graph, and unlike BDD-based symbolic model checking, 
it does not memorize the set of visited states. For this reason, it is possible 
that all paths between two states are explored, and hence a single state can be 
visited an exponential number of times. For example, an explicit model checking 
algorithm, such as the double DFS algorithm [10], will visit each state in the 
graph below not more than twice. SAT-based BMC, on the other hand, will
consider in the worst case all $2^n$ possible paths between s and t, where n is the
number of ‘diamonds’ in the graph.
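
The gap between the number of paths and the number of states in such a graph is easy
to reproduce. The sketch below builds a chain of n diamonds (the node names are
hypothetical) and compares the number of distinct paths from s to t with the number of
states an ordinary depth-first search with a visited set touches. The plain DFS here
stands in for the double DFS of [10] only as far as the state count is concerned.

```python
def diamond_chain(n):
    """Successor map of a chain of n >= 1 diamonds from node 's' to node 't'."""
    succ = {}
    prev = "s"
    for i in range(n):
        top, bottom = f"u{i}", f"d{i}"
        join = "t" if i == n - 1 else f"j{i}"
        succ[prev] = [top, bottom]
        succ[top] = [join]
        succ[bottom] = [join]
        prev = join
    succ["t"] = []
    return succ


def count_paths(succ, u, target):
    """Number of distinct paths from u to target (the graph is acyclic)."""
    if u == target:
        return 1
    return sum(count_paths(succ, v, target) for v in succ[u])


def dfs_visits(succ, root):
    """Number of states expanded by a DFS that remembers visited states."""
    seen, stack = set(), [root]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        stack.extend(succ[u])
    return len(seen)


GRAPH = diamond_chain(10)
```

With 10 diamonds there are 1024 paths but only 31 states, so a search that forgets
visited states may do exponentially more work than one that remembers them.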




A natural question is whether this complexity gap can be closed, i.e., is
it possible to change the SAT-based BMC algorithm so it becomes a singly
exponential rather than a doubly exponential algorithm. Figure 4 presents a
possible singly exponential BMC algorithm for Gp formulas (i.e., reachability)
based on an altered SAT algorithm that can be implemented by slightly changing
a standard SAT solver. The algorithm forces the SAT solver to follow a particular
variable ordering, and hence the main power of SAT (guidance of the search
process with splitting heuristics) is lost. Further, it adds constraints for each
visited state, forbidding the search process from revisiting it through longer or
equally long paths. This potentially adds an exponential number of clauses to
the formula.



1. Force a static order, following a forward traversal.

2. Each time a state i is fully evaluated (assigned):

   - Prevent the search from revisiting it through deeper paths, e.g., if $(x_i, \neg y_i)$ is
     a visited state, then for $i < j \le CT$ add the following blocking state clause:
     $(\neg x_j \vee y_j)$.

   - When backtracking from state i, prevent the search from revisiting it in step
     i by adding the clause $(\neg x_i \vee y_i)$.

   - If $\neg p_i$ holds, stop and return ‘Counterexample found’.



Fig. 4. A singly exponential SAT-based BMC algorithm for Gp properties. 



So far our experiments show that this procedure is worse in practice than the
standard BMC¹. Whether it is possible to find a singly exponential SAT-based
algorithm that works better in practice than the standard algorithm is still an
open question of very significant practical importance.



6 Conclusions 

We discussed the advantages of the semantic translation for BMC, as was first 
suggested in [6]. We showed that it is in general more efficient, as it results in 
smaller CNF formulas, and it potentially eliminates redundancies in the property 
of interest. We also showed how it allows us to compute the completeness threshold
CT for all LTL formulas. 

The ability to compute CT for general LTL enabled us to prove that all 
existing SAT-based BMC algorithms are doubly exponential in the number of 
variables. Since LTL model checking is only singly exponential in the number of 
variables, there is a complexity gap between the two approaches. In order to close 
this gap, we suggested a revised BMC algorithm that is only singly exponential, 
but in practice, so far, has not proved to be better than the original SAT-based
BMC. 

¹ A. Biere implemented a similar algorithm in 2001 and reached the same conclusion.
For this reason he did not publish this algorithm. Similar algorithms were also used
in the past in the context of Automatic Test Pattern Generation (ATPG).

Abstract. We use hidden algebra as a formal framework for the object
paradigm. We introduce a labeled transition system for each object
specification model, and then define a suitable notion of bisimulation
over these models. The labeled transition systems are used to define
CTL models of object specifications. Given two hidden algebra models
of an object specification, the bisimilar states satisfy the same set of
CTL formulas. We build a canonical CTL model directly from the
object specification. Using this CTL model, we can verify the temporal
properties using a software tool allowing SMV model checking.

Keywords: hidden algebra, object specification, labeled transition systems,
behavioral bisimulation, CTL models, model checking, SMV.



1 Introduction 

Hidden algebra [4] is a theoretical framework originally aimed at giving
semantics to objects. It generalizes algebraic specification by making a clear
separation between visible and hidden types, the former standing for data and the
latter for object states. Hidden algebra has been successfully used to formally
specify, simulate, and verify behavioral correctness of object systems. One of its
distinctive features is the relation of behavioral equivalence, also known as
“indistinguishability under experiments”, which states that two (states of) objects
are behaviorally equivalent if they cannot be distinguished by any of the
experiments that one can or is interested to perform on the system.

The aim of the paper is to provide an automatic method to verify temporal
properties of systems specified in hidden algebra. We use CTL to express
the temporal properties, and we define the satisfaction relation using labeled
transition systems derived from hidden models. We present a way to verify these
properties working directly over the specifications, independent of hidden
models. The method can be automated using a software tool and the SMV model checker.

The paper is organized in the following way. In Section 2 we briefly present
hidden algebra, specification of objects in hidden algebra, and the BOBJ system.
We show how mutual exclusion is described in BOBJ using the concurrent
connection of object specifications. In Section 3 we introduce the labeled
transition system for each object model. Then we define the behavioral bisimulation
and present a nontrivial example. In Section 4 we recall some basics of
Computation Tree Logic (CTL), and define the CTL models of the object
specifications. We prove that bisimilar states satisfy the same set of CTL formulas.
Section 5 describes a canonical CTL model of an object specification able to
support model checking for temporal properties. We briefly present a procedure
taking an object specification as input, and producing an SMV description of the
canonical CTL model as output. In this way we extend the equational behavioral
approach by adding a new dimension to the object specifications, namely
the temporal requirements checkable by using our procedure and SMV. A more
detailed presentation of this approach is given in [7].

2 Specification of Objects in Hidden Algebra 

We assume that the reader is familiar with algebraic specification. A detailed
presentation of hidden algebra can be found in [9,4]. Here we recall the main
concepts and notations. A (fixed data) hidden signature consists of two disjoint
sets: $V$ of visible sorts, and $H$ of hidden sorts; a many-sorted $(V \cup H)$-signature
$\Sigma$; and a $\Sigma|_V$-algebra $D$ called data algebra. Such a hidden signature is denoted
by $\Sigma$, and its constituents are denoted by $H(\Sigma)$, $V(\Sigma)$, and $D(\Sigma)$ respectively.

Given a hidden signature $(V, H, \Sigma, D)$, a hidden $\Sigma$-model is a $\Sigma$-algebra $M$
such that $M|_{\Sigma|_V} = D$. This means that the interpretation of each sort $s \in V \cup H$
is a distinct set $M_s$, and the interpretation of a symbol $f \in \Sigma_{s_1 \ldots s_n, s}$ is a function
$[\![f]\!]_M : M_{s_1} \times \cdots \times M_{s_n} \to M_s$. By $M|_{\Sigma|_V}$ we denote the algebra $M$ restricted
only to the visible sorts and visible operations. A hidden $\Sigma$-homomorphism
$h : M \to M'$ is a $\Sigma$-homomorphism such that $h|_{\Sigma|_V} = id_D$.

Given a hidden signature $(V, H, \Sigma, D)$ and a subsignature $\Gamma \subseteq \Sigma$ such that
$\Gamma|_V = \Sigma|_V$, a $\Gamma$-context for sort $s$ is a term in $T_\Gamma(\{\_ : s\} \uplus Z)$ having exactly
one occurrence of a special variable $\_$ of sort $s$. $Z$ is an infinite set of distinct
variables. $C_\Gamma[\_ : s]$ denotes the set of all $\Gamma$-contexts for a sort $s$. If $c \in C_\Gamma[\_ : s]$,
then the sort of the term $c$ is called the result sort of the context $c$. A $\Gamma$-context
with visible result sort is called a $\Gamma$-experiment. If $c \in C_\Gamma[\_ : s]$ with the result
sort $s'$ and $t \in T_\Sigma(X)_s$, then $c[t]$ denotes the term in $T_\Sigma(var(c) \cup X)$ obtained
from $c$ by substituting $t$ for $\_$. Furthermore, for each hidden $\Sigma$-model $M$, $c$ is a
map $[\![c]\!]_M : M_s \to [[var(c) \to M] \to M_{s'}]$ defined by $[\![c]\!]_M(a)(\vartheta) = a_\vartheta(c)$, where $a_\vartheta$
is the variable assignment $\{\_ \mapsto a\} \cup \{z \mapsto \vartheta(z) \mid z \in var(c)\}$. We call $[\![c]\!]_M$ the
interpretation of the context $c$ in $M$.

Given a hidden signature $(V, H, \Sigma, D)$, a subsignature $\Gamma \subseteq \Sigma$ such that
$\Gamma|_V = \Sigma|_V$ and a hidden $\Sigma$-model $M$, the $\Gamma$-behavioral equivalence on $M$,
denoted by $\equiv^\Gamma_\Sigma$, is defined as follows:

for any sort $s \in V \cup H$ and any $a, a' \in M_s$,

$a \equiv^\Gamma_\Sigma a'$ iff $[\![c]\!]_M(a)(\vartheta) = [\![c]\!]_M(a')(\vartheta)$

for all $\Gamma$-experiments $c$ and all $(V \cup H)$-sorted maps $\vartheta : var(c) \to M$.

Given an equivalence $\sim$ on $M$, an operation $f \in \Sigma_{s_1 \ldots s_n, s}$ is congruent wrt $\sim$
iff $[\![f]\!]_M(a_1, \ldots, a_n) \sim [\![f]\!]_M(a'_1, \ldots, a'_n)$ whenever $a_i \sim a'_i$ for $i = 1, \ldots, n$. An
operation $f \in \Sigma$ is $\Gamma$-behaviorally congruent wrt $M$ iff it is congruent wrt $\equiv^\Gamma_\Sigma$.
A hidden $\Gamma$-congruence on $M$ is a $(V \cup H)$-equivalence which is the identity on
visible sorts and each operation in $\Gamma$ is congruent wrt it.

Theorem 1. [4,9] Given a hidden signature $(V, H, \Sigma, D)$, a subsignature $\Gamma \subseteq \Sigma$
such that $\Gamma|_V = \Sigma|_V$ and a hidden $\Sigma$-model $M$, then $\Gamma$-behavioral equivalence
is the largest hidden $\Gamma$-congruence on $M$.

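A hidden $\Sigma$-model $M$ $\Gamma$-behaviorally satisfies a $\Sigma$-equation $e$ of the form
$(\forall X)\, t = t'$ if $C$ if and only if for all $\vartheta : X \to M$, $\vartheta(t) \equiv^\Gamma_\Sigma \vartheta(t')$ whenever
$\vartheta(u) \equiv^\Gamma_\Sigma \vartheta(v)$ for all $u = v$ in $C$. We write $M \models^\Gamma_\Sigma e$. If $E$ is a set of $\Sigma$-equations,
then we write $M \models^\Gamma_\Sigma E$ iff $M \models^\Gamma_\Sigma e$ for all $e$ in $E$.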

A behavioral specification is a triplet $B = (\Sigma, \Gamma, E)$ consisting of a hidden
signature $\Sigma$, a subsignature $\Gamma \subseteq \Sigma$ such that $\Gamma|_V = \Sigma|_V$ and a set $E$ of $\Sigma$-equations.
We often denote the constituents of $B$ by $\Sigma(B)$, $\Gamma(B)$ and $E(B)$, respectively. The
operations in $\Gamma \setminus (\Sigma|_V)$ are called behavioral. A hidden $\Sigma$-model $M$ behaviorally
satisfies the specification $B$ iff $M$ $\Gamma$-behaviorally satisfies $E$, that is $M \models^\Gamma_\Sigma E$.
We write $M \models B$ and we say that $M$ is a $B$-model. For any equation $e$, we write
$B \models e$ iff $M \models B$ implies $M \models^\Gamma_\Sigma e$. An operation $f \in \Sigma$ is behaviorally congruent
wrt $B$ iff $f$ is $\Gamma$-behaviorally congruent wrt each $B$-model $M$.

Behavioral specifications can be used to model concurrent objects as follows. 
B specifies a simple object iff: 

1. $H(B)$ has a unique element $h$ called state sort;

2. each operation $f \in \Sigma \setminus \Sigma|_V$ is either:

   - a hidden (generalized) constant which models an initial state, or

   - a method $f : h\, v_1 \cdots v_n \to h$ with $v_i \in V$, for $i = 1, \ldots, n$, or

   - an attribute $f : h\, v_1 \cdots v_n \to v$ with $v \in V$ and $v_i \in V$, for $i = 1, \ldots, n$.

In other words, the framework for simple objects is the monadic fixed-data
hidden algebra [9]. A concurrent connection $B_1 \| \cdots \| B_n$ is defined as in [5] where
the (composite) state sort is implemented as tupling. If $h_i$ is the state sort of
$B_i$, then a composite state is a tuple $\langle S_1, \ldots, S_n \rangle$ : Tuple where the state $S_i$ is
of sort $h_i$. Projection operations $i^*$, $i = 1, \ldots, n$, are defined by $i^*\langle S_1, \ldots, S_n \rangle = S_i$
together with the “tupling equation” $\langle 1^*T, \ldots, n^*T \rangle = T$ for each $T$ of sort
Tuple. We assume that all the specifications $B_1, \ldots, B_n$ share the same data
algebra. By object specification we mean either a simple object specification, or
a conservative extension of a concurrent connection of object specifications.

A similar approach to behavioral specifications is given in [6]. There the
behavioral attributes are called direct observers and the behavioral methods are
called indirect observers. However, there are some subtle differences between the 
two approaches, e.g. the definition for signature morphisms, but these differences 
are not essential for our approach. We have chosen hidden algebra because there 
is the software tool BOBJ able to execute behavioral specifications and to prove 
properties expressed in terms of behavioral equivalences. 

The BOBJ system [2] is used for behavioral specification, computation, and
verification. BOBJ extends OBJ3 with support for behavioral specification and
verification, and in particular, it provides circular coinductive rewriting with
case analysis for conditional equations over behavioral theories. All examples in
this paper are presented in BOBJ.

Example 1. Critical Region. A critical region is a section of code executed
under mutual exclusion. A solution for mutual exclusion is provided by semaphores.
The data algebra includes the process status values and the semaphore values:

dth DATA is
  sorts ProcStatus SemVal .
  ops idle blocked critical : -> ProcStatus .
  ops 0 1 : -> SemVal .
end

A process is specified in a rather abstract way as a simple object with just 
one method, that changes the status of the process, and just one attribute that 
returns the status: 

bth PROCESS is
  sort Process . inc DATA .
  op setSt : Process ProcStatus -> Process .
  op getSt : Process -> ProcStatus .
  var P : Process . var S : ProcStatus .
  eq getSt(setSt(P, S)) = S .
end

Here we consider only two processes: 

bth PR1 is inc PROCESS * (sort Process to Pr1) . end
bth PR2 is inc PROCESS * (sort Process to Pr2) . end

A semaphore is a protected variable whose value can be accessed and altered 
only by two primitive operations. A binary semaphore can take only the values 
0 or 1. The module SEM given below describes a binary semaphore as a simple 
object. The two primitives are described by two methods, setOn and setOff, 
and the attribute val returns the value of the semaphore:

bth SEM is
  sort Sem . inc DATA .
  op val : Sem -> SemVal .
  ops setOn setOff : Sem -> Sem .
  var S : Sem .
  eq val(setOn(S)) = 1 .
  eq val(setOff(S)) = 0 .
end

We present the mutual exclusion considering a semaphore SEM working
concurrently with the two processes PR1 and PR2. We define four methods, enter1,
enter2, exit1, and exit2, implementing the previous procedures for each
process. Since we consider only two processes, the queue may contain at most one
blocked process and therefore it is useless. Testing the emptiness of the queue is
equivalent to checking whether the other process is blocked:
bth CS is
  inc (PR1 || PR2 || SEM) * (sort Tuple to Cs) .
  op init : -> Cs .
  ops enter1 enter2 exit1 exit2 : Cs -> Cs .
  var C : Cs .
  eq enter1(C) =
     if (getSt(1*C) == idle)
     then if (val(3*C) == 1.SemVal)
          then <setSt(1*C, critical), 2*C, setOff(3*C)>
          else <setSt(1*C, blocked), 2*C, 3*C>
          fi
     else C
     fi .
  *** enter2(C) is similar to enter1(C)
  eq exit1(C) =
     if (getSt(1*C) == critical)
     then if (getSt(2*C) == blocked)
          then <setSt(1*C, idle), setSt(2*C, critical), 3*C>
          else <setSt(1*C, idle), 2*C, setOn(3*C)>
          fi
     else C
     fi .
  *** exit2(C) is similar to exit1(C)
  eq val(3*init) = 1.SemVal .
  eq getSt(1*init) = idle .
  eq getSt(2*init) = idle .
end

The behavioral specification CS is a conservative extension of the concurrent
connection PR1 || PR2 || SEM. The role of the four newly added derived operations is
to synchronize the three concurrent objects in order to achieve a specific task.
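
To make the behavior of the four derived operations concrete, the following Python
sketch simulates one possible concrete model of CS. It is not BOBJ and not part of the
specification: a state is simply a tuple (status of process 1, status of process 2,
semaphore value), the functions enter and exit_ are hypothetical names, and enter2 and
exit2 are assumed to be the symmetric counterparts of enter1 and exit1, as the comments
in the specification suggest.

```python
IDLE, BLOCKED, CRITICAL = "idle", "blocked", "critical"

# One concrete choice for the state denoted by `init`:
# both processes idle, semaphore value 1.
INIT = (IDLE, IDLE, 1)


def enter(state, who):
    """Counterpart of enter1 (who=0) and enter2 (who=1)."""
    procs, sem = list(state[:2]), state[2]
    if procs[who] != IDLE:
        return state                      # the "else C" branch
    if sem == 1:
        procs[who], sem = CRITICAL, 0     # take the semaphore (setOff)
    else:
        procs[who] = BLOCKED              # wait for the other process
    return (procs[0], procs[1], sem)


def exit_(state, who):
    """Counterpart of exit1 (who=0) and exit2 (who=1)."""
    procs, sem = list(state[:2]), state[2]
    other = 1 - who
    if procs[who] != CRITICAL:
        return state                      # the "else C" branch
    procs[who] = IDLE
    if procs[other] == BLOCKED:
        procs[other] = CRITICAL           # hand the critical region over
    else:
        sem = 1                           # release the semaphore (setOn)
    return (procs[0], procs[1], sem)


S1 = enter(INIT, 0)   # process 1 enters
S2 = enter(S1, 1)     # process 2 tries to enter and is blocked
S3 = exit_(S2, 0)     # process 1 leaves, process 2 takes over
S4 = exit_(S3, 1)     # process 2 leaves, the semaphore is released
```

The run S1 to S4 follows one round of the protocol and ends in the initial state again.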



3 LTS and Bisimulation 

The structure of the object specifications allows us to associate a labeled transition
system (LTS) to each object model. The idea is to use the behavioral methods in
$\Gamma$ as actions that produce transitions and the attributes in $\Gamma$ as queries asking
for information about a given state. If $B = (\Sigma, \Gamma, E)$ is an object specification,
then we write $\Gamma = Att(\Gamma) \cup Met(\Gamma)$, where $Att(\Gamma)$ denotes the set of attributes
of $\Gamma$ and $Met(\Gamma)$ denotes the set of methods of $\Gamma$. We make the assumptions
that $\Gamma = \Sigma|_V \cup Met(\Gamma) \cup Att(\Gamma)$ and $D_v \neq \emptyset$ for each visible sort $v$.

Definition 1. Given an object specification $B = (\Sigma, \Gamma, E)$, an atomic $\Gamma$-query
for hidden sort $h$ is a term of the form $q(\_, z_1, \ldots, z_n)$ with $q \in Att(\Gamma)$, $\_$ a
special variable of sort $h$, and $z_1, \ldots, z_n$ pairwise distinct visible variables.
We denote by $AQ_\Gamma[\_ : h]$ the set of all atomic $\Gamma$-queries for $h$. An atomic
$\Gamma$-action for $h$ is a term $\alpha$ of the form $g(\_, t_1, \ldots, t_n)$, where $g \in Met(\Gamma)$ is a
method and $t_i$ is either an atomic $\Gamma$-query, or a term in $T_{\Sigma|_V}(Z)$ with $Z$ a set
of visible variables, or an element in $D$, for $i = 1, \ldots, n$. We denote by $AA_\Gamma[\_ : h]$
the set of all atomic $\Gamma$-actions.

The dynamic behavior of a system is given by the sequences of states pro- 
duced by an execution. Hidden algebra describes only partially this dynamics by 
providing means to investigate the behavior of the states under various experi- 
ments. We complete this partial approach by using the labeled transition system 
given by atomic actions over hidden models. The atomic actions correspond to 
the method calls. The presence of a query in an action models a communication
between objects.

Definition 2. Given an object specification $B = (\Sigma, \Gamma, E)$ and a $B$-model $M$,
the labeled transition system defined by $M$ is $LTS_\Gamma(M) = (M_h, AA_\Gamma[\_ : h], \to_M)$,
where $h$ is the state sort of $B$ and the transition relation $\to_M$ is given by: if $\alpha$ is
an atomic action, then $a \xrightarrow{\alpha}_M a'$ iff there is a variable assignment $\vartheta : var(\alpha) \to
D(B)$ such that $a' = [\![\alpha]\!]_M(a)(\vartheta)$.

The expression $[\![\alpha]\!]_M(a)(\vartheta)$ means that the method designated by $\alpha$ in $M$ is
executed over the state $a$ with the actual arguments $\vartheta(var(\alpha))$. The transition
relation can be non-deterministic.

An interesting problem is the relationship between the LTSs defined by two 
models of the same specification. This question can be answered using an ap- 
propriate bisimulation notion. 

Definition 3. Given an object specification $B = (\Sigma, \Gamma, E)$, two $B$-models $M$
and $M'$, and a relation $R \subseteq M_h \times M'_h$ where $h$ is the state sort of $B$, we say that
$R$ is a $\Gamma$-behavioral bisimulation between $M$ and $M'$ iff:

1. it is a strong bisimulation between $LTS_\Gamma(M)$ and $LTS_\Gamma(M')$, and

2. $a\,R\,a'$ implies $[\![q]\!]_M(a, d_1, \ldots, d_n) = [\![q]\!]_{M'}(a', d_1, \ldots, d_n)$ for each atomic
$\Gamma$-query $q(\_, z_1, \ldots, z_n)$ and variable assignment $\{z_i \mapsto d_i \mid i = 1, \ldots, n\}$.

The second condition in Definition 3 expresses the behavioral nature of the bisim- 
ulation relation. The addition of this constraint preserves the main properties 
for the behavioral bisimulations. 

Proposition 1. Given an object specification $B = (\Sigma, \Gamma, E)$, then:

1. the $\Gamma$-behavioral equivalence is a $\Gamma$-behavioral bisimulation;

2. the inverse of a $\Gamma$-behavioral bisimulation is a $\Gamma$-behavioral bisimulation;

3. the composition, the union, and the intersection of two $\Gamma$-behavioral
bisimulations are $\Gamma$-behavioral bisimulations.

Given an object specification $B = (\Sigma, \Gamma, E)$ and two $B$-models $M$ and $M'$,
then $M \approx_\Gamma M'$ iff there is a $\Gamma$-behavioral bisimulation $R$ between $M$ and $M'$.

Proposition 2. Given an object specification $B = (\Sigma, \Gamma, E)$, then $\approx_\Gamma$ is an
equivalence relation.

If $M \approx_\Gamma M'$, then $\sim^\Gamma_{MM'}$ denotes the largest $\Gamma$-behavioral bisimulation
between $M$ and $M'$ and it is defined by:

$\bigcup \{ R \mid R$ is a $\Gamma$-behavioral bisimulation between $M$ and $M' \}$.

An interesting example of $\Gamma$-behavioral bisimulation is the following. Let
$B = (\Sigma, \Gamma, E)$ be an object specification with the state sort $h$. A composite
$\Gamma$-query for $h$ is coinductively defined as follows:

- each atomic $\Gamma$-query is a composite $\Gamma$-query for $h$, and the only occurrence
  of $\_$ is marked;

- for each composite $\Gamma$-query $c$, for each atomic $\Gamma$-action $g(\_, t_1, \ldots, t_n)$,
  $c[g(\_, t_1, \ldots, t_n)]$ is a composite $\Gamma$-query that denotes the term obtained
  from $c$ by replacing the marked occurrence of the special variable $\_$ with
  $g(\_, t_1, \ldots, t_n)$, and the new occurrence of $\_$ as argument of $g$ is marked.

We denote by $Q_\Gamma[\_ : h]$ the set of all composite $\Gamma$-queries for $h$. Given two
$B$-models $M$, $M'$, the relation $\equiv^\Gamma_{MM'} \subseteq M_h \times M'_h$ is defined as follows: $a \equiv^\Gamma_{MM'} a'$
iff for any composite $\Gamma$-query $c \in Q_\Gamma[\_ : h]$ and for any $\vartheta : var(c) \to D(B)$, we
have $[\![c]\!]_M(a)(\vartheta) = [\![c]\!]_{M'}(a')(\vartheta)$.

The definition of the relation $\equiv^\Gamma_{MM'}$ is similar to that of the behavioral
equivalence. In fact, we can prove that if $M = M'$ then the behavioral bisimulation
$\equiv^\Gamma_{MM}$ is the same as the behavioral equivalence over the state sort.

Theorem 2. Given an object specification $B = (\Sigma, \Gamma, E)$ with the state sort $h$
and a $B$-model $M$, then the $\Gamma$-behavioral equivalence is the same as the
$\Gamma$-behavioral bisimulation over the sort $h$, i.e.

$a \equiv^\Gamma_\Sigma a'$ iff $a \equiv^\Gamma_{MM} a'$ for all states $a, a' \in M_h$.

Proposition 3. Given an object specification $B = (\Sigma, \Gamma, E)$ and two $B$-models
$M$ and $M'$, $\equiv^\Gamma_{MM'}$ is a bisimulation whenever it is nonempty.

There are cases when the relation $\equiv^\Gamma_{MM'}$ is empty (see [7]). However, these
cases are very rare in practice. In order to avoid them, we consider “well
founded” object specifications $B$ where $\equiv^\Gamma_{MM'}$ is nonempty for any two $B$-models
$M$ and $M'$.



4 CTL Models 



Computation Tree Logic (CTL) is a branching time logic, meaning that its
model of time is a tree-like structure in which the future is not determined;
there are different paths in the future, any one of which might be the “actual”
path that is desired. We assume a fixed set Atoms of atomic propositions denoted
by $p, q, r, \ldots$. Following [3], the CTL formulas are inductively defined as follows:

$\phi ::= tt \mid ff \mid p \mid (\neg\phi) \mid (\phi_1 \wedge \phi_2) \mid EX\,\phi \mid EG\,\phi \mid E[\phi_1\, U\, \phi_2]$.

The intuitive meaning for each propositional connective is the usual one.
The temporal connectives are pairs of symbols. The first is E, meaning “there
Exists one path”. The second one is X, G, or U, meaning “neXt state”, “all future
states (Globally)”, and “Until”, respectively. We can express five other operators,
where A means “for All paths”:

$EF\,\phi = E[tt\; U\; \phi]$            $AX\,\phi = \neg EX(\neg\phi)$

$AG\,\phi = \neg EF(\neg\phi)$            $AF\,\phi = \neg EG(\neg\phi)$

$A[\phi_1\, U\, \phi_2] = \neg E[\neg\phi_2\; U\; (\neg\phi_1 \wedge \neg\phi_2)] \wedge \neg EG\,\neg\phi_2$

A model for CTL is a triple $\mathcal{M} = (S, \to, L)$ consisting of a set of states $S$
endowed with a transition relation $\to$ such that for every state $s \in S$ there is a
state $s' \in S$ with $s \to s'$, and a labeling function $L : S \to \mathcal{P}(Atoms)$.

Let $\mathcal{M} = (S, \to, L)$ be a model for CTL. The satisfaction relation $\mathcal{M}, s \models \phi$,
expressing that the state $s \in S$ satisfies in $\mathcal{M}$ the CTL formula $\phi$, is inductively
defined as follows:







D. Lucanu and G. Ciobanu 



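1. $\mathcal{M}, s \models tt$ and $\mathcal{M}, s \not\models ff$ for all $s \in S$.

2. $\mathcal{M}, s \models p$ iff $p \in L(s)$.

3. $\mathcal{M}, s \models \neg\phi$ iff $\mathcal{M}, s \not\models \phi$.

4. $\mathcal{M}, s \models \phi_1 \wedge \phi_2$ iff $\mathcal{M}, s \models \phi_1$ and $\mathcal{M}, s \models \phi_2$.

5. $\mathcal{M}, s \models EX\,\phi$ iff for some $s_1$ such that $s \to s_1$ we have $\mathcal{M}, s_1 \models \phi$.

6. $\mathcal{M}, s \models EG\,\phi$ iff there is a path $s = s_0 \to s_1 \to s_2 \to \cdots$ such that $\mathcal{M}, s_i \models \phi$
   for all $i = 0, 1, 2, \ldots$

7. $\mathcal{M}, s \models E[\phi_1\, U\, \phi_2]$ iff there is a path $s = s_0 \to s_1 \to s_2 \to \cdots$ which satisfies
   the formula $\phi_1\, U\, \phi_2$, i.e. there is $i \in \{0, 1, 2, \ldots\}$ such that $\mathcal{M}, s_i \models \phi_2$ and
   $\mathcal{M}, s_j \models \phi_1$ for all $j < i$.

We write $s \models \phi$ for $\mathcal{M}, s \models \phi$ whenever $\mathcal{M}$ is understood from the context. In
order to define CTL formulas for object specification, it is enough to define the
specific CTL atomic propositions.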
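
On a finite model the three basic temporal connectives can be evaluated by the
standard fixed-point computations. The sketch below is a minimal illustration in
Python, not the procedure used by SMV; the function names, the encoding of a model as
a successor map, and the small example model are assumptions made for this example.
Sets of states stand for the formulas they satisfy.

```python
def sat_EX(succ, phi):
    """States with at least one successor in phi."""
    return {s for s in succ if any(t in phi for t in succ[s])}


def sat_EG(succ, phi):
    """Greatest fixed point: states of phi that can stay in phi forever."""
    current = set(phi)
    while True:
        nxt = {s for s in current if any(t in current for t in succ[s])}
        if nxt == current:
            return current
        current = nxt


def sat_EU(succ, phi1, phi2):
    """Least fixed point: states that reach phi2 along a path through phi1."""
    current = set(phi2)
    while True:
        nxt = current | {s for s in phi1 if any(t in current for t in succ[s])}
        if nxt == current:
            return current
        current = nxt


def sat_AG(succ, phi):
    """AG phi = not EF not phi = not E[tt U not phi]."""
    states = set(succ)
    return states - sat_EU(succ, states, states - set(phi))


# Hypothetical model: 3 -> 0 -> 1 -> 2 -> 2, with p holding in states 1 and 2.
SUCC = {0: [1], 1: [2], 2: [2], 3: [0]}
P = {1, 2}
```

Every state in SUCC has a successor, as required of a CTL model. In this model p holds
globally from states 1 and 2 only, since states 0 and 3 do not satisfy p themselves.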

Definition 4. Given an object specification $B = (\Sigma, \Gamma, E)$ with the state sort $h$,
a CTL atomic $\Gamma$-proposition for sort $h$ is an equation

$q(\_, d_1, \ldots, d_n) = d$ with $q \in Att(\Gamma)$ and $d, d_1, \ldots, d_n \in D(B)$.

The intuitive meaning of an atomic $\Gamma$-proposition is that we obtain $d$ whenever
we execute a query $q(\_, d_1, \ldots, d_n)$ over the current state.

Definition 5. Given an object specification $B = (\Sigma, \Gamma, E)$ with the state sort $h$
and a $B$-model $M$, we say that a state $a \in M_h$ satisfies the atomic $\Gamma$-proposition
$q(\_, d_1, \ldots, d_n) = d$ iff $[\![q]\!]_M(a, d_1, \ldots, d_n) = d$. We denote by $L_\Gamma(a)$ the set of
atomic $\Gamma$-propositions satisfied by the state $a$.



Example 2. Critical Region (continued). Consider

$Att(\Gamma)$ = {getSt(1 * _), getSt(2 * _), val(3 * _)} and
$a$ = <setSt(1 * init, critical), 2 * init, setOff(3 * init)>

expressing that the first process has entered the critical section, the second
process is in the initial state, and the semaphore is off. Then

$L_\Gamma(a)$ = {getSt(1 * _) = critical, getSt(2 * _) = idle, val(3 * _) = 0}.

The LTSs associated to $B$-models are used to build CTL models for $B$.

Proposition 4. Given an object specification $B = (\Sigma, \Gamma, E)$ with the state sort
$h$ having at least one method in $\Gamma$, and a $B$-model $M$, $LTS_\Gamma(M)$ together with
the labeling function $L_\Gamma$ form a CTL model.

Theorem 3. Consider an object specification $B = (\Sigma, \Gamma, E)$ with the state sort
$h$, two $B$-models $M$ and $M'$, $a \in M_h$, and $a' \in M'_h$. If $a \sim^\Gamma_{MM'} a'$ then $a$ and
$a'$ satisfy the same set of CTL formulas.

A converse result does not hold because we consider only behavioral
attributes in our CTL atomic propositions. If we also consider the behavioral
methods, then we can say that two states are bisimilar iff they satisfy the same
set of CTL formulas, obtaining in this way a result similar to Proposition 4.8 in
[8].

Theorem 3 has an important meaning for the study of the temporal
properties. Given a model $M$ and its sets of CTL formulas, we may transfer these
sets of CTL formulas to any other model $M'$ as follows: for a state $a'$ of $M'$, we
search for a bisimilar state $a$ of $M$ and consider its set of CTL formulas. It is
necessary to consider well founded specifications.

We note that the set $E$ of equations plays a minimal role for proving the
temporal properties. The next section reveals the essential role played by $E$ in finding a
canonical CTL model suitable for a model checking algorithm.

5 Model Checking for Object Specification 

Model checking [3] is an automatic technique for verifying properties of finite- 
state systems. The properties are expressed using temporal logic, e.g. CTL, and 
the system is modeled by a state-transition graph. Then an efficient search pro- 
cedure is used to check if the formulas expressing the system properties are 
satisfied by the corresponding transition graph. 

CTL can express temporal behavioral properties of systems described by 
object specifications. In this section we investigate how we can associate a state- 
transition graph to an object specification such that the model checking proce- 
dure can be applied to verify the system. The temporal properties of a system 
can be expressed starting from an initial state, i.e. the state whose temporal 
properties we are searching for. This justifies the following definition. 

Definition 6. Given an object specification $B = (\Sigma, \Gamma, E)$ with the state sort $h$,
and a $B$-model $M$, a state $a \in M_h$ is initial iff there is a (generalized) hidden
constant $t$ such that $a = [\![t]\!]_M$ ($a = [\![t]\!]_M(d_1, \ldots, d_n)$ for some $d_1, \ldots, d_n \in D$ if
$t$ is generalized). We say that $M$ satisfies a CTL formula $\varphi$ and write $M \models \varphi$
iff $M, a \models \varphi$ for any initial state $a \in M_h$, where $h$ is the state sort of $B$. We say
that $M$ satisfies a set $\Phi$ of CTL formulas and we write $M \models \Phi$ iff $M \models \varphi$ for
each $\varphi \in \Phi$. Finally, we say that $B$ satisfies a set $\Phi$ of CTL formulas and we
write $B \models \Phi$ iff $M \models \Phi$ for each $B$-model $M$.

Example 3. Critical Region (continued). A temporal specification for the criti- 
cal region may include: 

- there is no state such that both processes execute the critical section:

AG(¬(getSt(1*_) = critical ∧ getSt(2*_) = critical))

- if no process executes the critical section, then both processes are idle:

AG(val(3*_) = 1 → (getSt(1*_) = idle ∧ getSt(2*_) = idle))
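
Properties of the form AG(p) can be checked on a finite transition graph by visiting every reachable state. The following is a minimal sketch of such a check in Python, not the SMV algorithm; the transition rules in `successors` are an assumed, simplified reading of the critical-region example (a state is a triple of the two process statuses and the semaphore value), and all names are hypothetical.

```python
from collections import deque


def check_ag(initial, successors, holds):
    """Check AG(holds): return (True, None) if `holds` is true in every state
    reachable from `initial`, else (False, shortest path to a violation)."""
    parent = {s: None for s in initial}
    queue = deque(initial)
    while queue:
        state = queue.popleft()
        if not holds(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return False, path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return True, None


def successors(state):
    """Assumed enter/exit rules for two processes sharing a binary semaphore."""
    result = []
    for i in (0, 1):
        procs, sem = list(state[:2]), state[2]
        other = 1 - i
        if procs[i] == "idle":            # try to enter
            if sem == 1:
                procs[i], sem = "critical", 0
            else:
                procs[i] = "blocked"
        elif procs[i] == "critical":      # exit, waking a blocked process
            procs[i] = "idle"
            if procs[other] == "blocked":
                procs[other] = "critical"
            else:
                sem = 1
        else:                             # a blocked process cannot move
            continue
        result.append((procs[0], procs[1], sem))
    return result


INIT = [("idle", "idle", 1)]
mutex = lambda s: not (s[0] == "critical" and s[1] == "critical")
free_means_idle = lambda s: s[2] != 1 or (s[0] == "idle" and s[1] == "idle")
```

Calling `check_ag(INIT, successors, mutex)` and `check_ag(INIT, successors, free_means_idle)` checks the two properties above on this toy model; when a property fails, the returned path is a counter-example.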

We are interested in finding the temporal properties of an object specification.
BOBJ is not always able to check whether these properties are satisfied. If we
build a model for the specification $B$, then we can apply a model checking
algorithm over the LTS defined by the model. The $B$-models are admissible
implementations, and the construction of such a model is a tedious task. In this section
we propose a solution which can be successfully applied in many practical cases.
We build a CTL model $KS(B)$ directly from the specification, and this step is
done by a software tool. The temporal properties satisfied by any $B$-model $M$
are the same as those satisfied by $KS(B)$ whenever some assumptions are
satisfied. Once we have the CTL model $KS(B)$, we can use the existing model
checker SMV in order to check the desired temporal properties. We start by
giving the construction of the CTL model $KS(B)$.

Assumptions. We consider only behavioral specifications $B = (\Sigma, \Gamma, E)$ with
the state sort $h$ and having the following properties:

1. If $v$ is the result sort of an attribute $q \in Att(\Gamma)$, then $D(B)_v$ is finite.

2. Any ground term $t$ of sort $h$ is defined, i.e. for each atomic $\Gamma$-query
$q(\_, d_1, \ldots, d_n)$ with $d_i \in D$, there is $d \in D$ such that
$B \models q(t, d_1, \ldots, d_n) = d$.

The definition for defined term is different from that used in [4] because we 
consider only atomic queries instead of arbitrary contexts. 

An abstract state S for B is constructed as follows: 

- we start with $S = \emptyset$, and

- for each $q \in Att(\Gamma)$ and appropriate $d_1, \ldots, d_n, d \in D$, we add the atomic
CTL proposition $q(\_, d_1, \ldots, d_n) = d$ to $S$.

According to our assumptions, the set of all abstract states is finite. The
transition relation over the abstract states is defined by: $S \xrightarrow{\alpha} S'$ iff for every
$B$-model $M$ and $a \in M_h$, if $a \models \bigwedge_{s \in S} s$ and $a \xrightarrow{\alpha} a'$, then $a' \models \bigwedge_{s' \in S'} s'$.

Even if this definition is of semantic nature, we can use the BOBJ system to
find these transitions. Given an abstract state $S$ and an action $\alpha$, $S'$ is
constructed according to the following steps:

1. Consider a hidden constant a of sort h. This is expressed in BOBJ by:

op a : -> h .

2. For each CTL proposition $q(\_, d_1, \ldots, d_n) = d$ in $S$, we define in BOBJ an
equation of the form:

eq q(a, d1, ..., dn) = d .

3. For each $q \in Att(\Gamma)$ and for each appropriate $d_1, \ldots, d_n$, we can compute
$q(\alpha(a), d_1, \ldots, d_n)$ using the BOBJ operational semantics:

red q(α(a), d1, ..., dn) .

We give as an example the BOBJ code that computes a transition for the critical
section. Let a be the state such that the first process has entered the critical
section, the second is waiting to enter (blocked), and the semaphore value is
zero. We assume that we want to compute the state obtained by the execution
of the method exit1(_). The BOBJ code is:

open .
op a : -> Cs .
eq val(3*a) = 0.SemVal .
eq getSt(1*a) = critical .
eq getSt(2*a) = blocked .
red getSt(1*exit1(a)) .
red getSt(2*exit1(a)) .
red val(3*exit1(a)) .
close

The BOBJ system provides the following output:

reduce in CS : getSt(1 * exit1(a))
result ProcStatus: idle

reduce in CS : getSt(2 * exit1(a))
result ProcStatus: critical

reduce in CS : val(3 * exit1(a))
result SemVal : 0

We see that we obtained the expected results: the first process became idle,
the second has entered the critical section, and the semaphore value remains
unchanged.

Given a $B$-model $M$, a state $a \in M_h$ is reachable w.r.t. $B$ iff there is a ground
term $t$ of sort $h$ such that $[\![t]\!]_M = a$. By $RLTS_\Gamma(M)$ we denote the subsystem of
$LTS_\Gamma(M)$ induced by the subset of the reachable states. The set of all reachable
abstract states together with the transition relation and the labeling function
$L(S) = S$ form a canonical LTS denoted by $KS(B)$.

Theorem 4. Given an object specification $B = (\Sigma, \Gamma, E)$ satisfying our assumptions
and a $B$-model $M$, then $KS(B)$ is a CTL model and for each reachable
$a \in M_h$ there is an abstract state $S_a$ in $KS(B)$ such that $a$ and $S_a$ satisfy the
same set of CTL formulas.



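Corollary 1. For any two $B$-models $M$ and $M'$, the relation $R \subseteq M_h \times M'_h$
defined by $a \mathrel{R} a'$ iff
$[\![q]\!]_M(a, d_1, \ldots, d_n) = [\![q]\!]_{M'}(a', d_1, \ldots, d_n)$ for any atomic $\Gamma$-query
$q(\_, z_1, \ldots, z_n)$ and variable assignment $\{z_i \mapsto d_i \mid i = 1, \ldots, n\}$,
is a $\Gamma$-behavioral bisimulation between $RLTS_\Gamma(M)$ and $RLTS_\Gamma(M')$.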

We briefly present a procedure implemented by Mihai Daneş which, given an
object specification $B$ as input, produces an SMV description of the CTL model
$KS(B)$. If $B$ does not satisfy our assumptions, then the procedure stops with
an error message. Once we have the SMV description of the CTL model,
we use the SMV system to check the temporal specification of the system. The
procedure works as follows:

1. Define an SMV variable for each query $q(\_, d_1, \ldots, d_n)$. In this way, a state
in $KS(B)$ is completely described by the set of the corresponding variable
values.

2. Use BOBJ to compute the variable values for the initial states.

3. For each unprocessed state, the tool uses BOBJ to compute the set of
next states as above and adds the SMV description of each newly computed
transition to the output. This process is finite due to the assumptions we
consider. If BOBJ fails to compute the next state, then the tool stops with
failure (the second assumption is not satisfied).
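
The exploration in step 3 is a standard worklist loop. The sketch below illustrates it in Python under stated assumptions: an abstract state is a frozenset of (query, value) pairs, and the hypothetical callback `step(state, action)` stands in for the BOBJ reductions, returning `None` when some attribute cannot be reduced. It is not the tool described above and it does not emit SMV text.

```python
def build_kripke(initial_states, actions, step):
    """Collect the abstract states reachable from `initial_states` and the
    transitions between them. `step` is a stand-in for the BOBJ reductions."""
    states = set(initial_states)
    transitions = set()
    worklist = list(initial_states)
    while worklist:
        state = worklist.pop()
        for action in actions:
            nxt = step(state, action)
            if nxt is None:
                # mirrors "the tool stops with failure"
                raise ValueError(f"next state undefined for {action!r}")
            transitions.add((state, action, nxt))
            if nxt not in states:
                states.add(nxt)
                worklist.append(nxt)
    return states, transitions


def toy_step(state, action):
    """Hypothetical one-attribute object: `val` is 0 or 1."""
    val = dict(state)["val"]
    if action == "toggle":
        return frozenset({("val", 1 - val)})
    if action == "reset":
        return frozenset({("val", 0)})
    return None  # unknown method: attribute value cannot be computed
```

Termination relies on the same argument as in the text: each state is put on the worklist at most once, and the set of abstract states is finite.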

6 Conclusion

This paper nontrivially extends the existing theory of hidden algebra: 

1. It defines the notion of “bisimulation” between two models, which is
important in order to say when two states behaviorally satisfy the same CTL
properties. This notion generalizes the already existing notion of behavioral
equivalence.

2. It gives an automatic procedure to extract a CTL model from special cases
of behavioral specifications. Behavioral satisfaction is a very hard problem in
general, so it is not surprising that we had to constrain the behavioral
specifications from which we can extract CTL models automatically.

3. For the first time, a model checking procedure is given for behavioral speci- 
fications (in hidden logic) of object systems, by employing the SMV model 
checker. 

In order to apply the model checking procedures, we present a construction of
a canonical CTL model directly from the object specification, i.e. independent
of any particular model. These ingredients provide a formal engineering environment
for the object-oriented paradigm, covering both (algebraic) equational features and
temporal aspects.

We are investigating how to weaken the constraints imposed on object specifications
in order to produce canonical CTL models. A challenging problem is to
develop a framework for model checking general behavioral specifications able
to specify infinite-state systems and having multiple hidden types and methods
with more than one hidden argument.

Abstract. Polygonal hybrid systems are a subclass of planar hybrid
automata which can be represented by piecewise constant differential 
inclusions. Here, we identify and compute an important object of such 
systems’ phase portrait, namely invariance kernels. An invariant set is a 
set of initial points of trajectories which keep rotating in a cycle forever 
and the invariance kernel is the largest of such sets. We show that this 
kernel is a non-convex polygon and we give a non-iterative algorithm for 
computing the coordinates of its vertices and edges. Moreover, we present 
a breadth-first search algorithm for solving the reachability problem for 
such systems. Invariance kernels play an important role in the algorithm. 



1 Introduction 

A hybrid system is a system where both continuous and discrete behaviors
interact with each other. A typical example is given by a discrete program that
interacts with (controls, monitors, supervises) a continuous physical environment.
In the last decade many (un)decidability results for a variety of problems
concerning classes of hybrid systems have been given [ACH+95, ABDM00,
BT00, DM98, GM99, KV00]. One of the main research areas in hybrid systems
is reachability analysis. Most of the proved decidability results are based on the
existence of a finite and computable partition of the state space into classes
of states which are equivalent with respect to reachability. This is the case for
timed automata [AD94], and classes of rectangular automata [HKPV95] and
hybrid automata with linear vector fields [LPY99]. For some particular classes
of two-dimensional dynamical systems a geometrical method, which relies on
the analysis of topological properties of the plane, has been developed. This
approach has been proposed in [MP93]. There, it is shown that the reachability
problem for two-dimensional systems with piecewise constant derivatives
(PCDs) is decidable. This result has been extended in [CV96] for planar piecewise
Hamiltonian systems and in [ASY01] for polygonal hybrid systems, a class
of nondeterministic systems that correspond to piecewise constant differential
inclusions on the plane, see Fig. 1(a). For historical reasons we call such a system
an SPDI [Sch02]. In [AMP95] it has been shown that the reachability problem
for PCDs is undecidable for dimensions higher than two.

* Supported by European project ADVANCE, Contract No. IST-1999-29082.

Fig. 1. (a) An SPDI and its trajectory segment; (b) Reachability analysis of the SPDI



Another important issue in the analysis of a (hybrid) dynamical system is the 
study of its qualitative behavior, namely the construction of its phase portrait. 
Typical questions one may want to answer include: “does every trajectory
(except for the equilibrium point at the origin) converge to a limit cycle?”, and
“what is the biggest set such that any point on it is reachable from any other
point on the set?”. There are very few results on the qualitative properties of
trajectories of hybrid systems [ASY02, Aub01, DV95, KV96, KdB01, MS00, SJSL00].
In particular, the question of defining and constructing phase portraits of hybrid
systems has not been directly addressed except in [MS00], where phase
portraits of deterministic systems with piecewise constant derivatives are explored,
and in [ASY02], where viability and controllability kernels for polygonal
differential inclusion systems have been computed.

In this paper we show how to compute another important object of phase portraits
of SPDIs, namely the invariance kernel. In general, an invariant set is a
set of points such that every trajectory starting in the set remains within the
set forever and the invariance kernel is the largest of such sets. We show that,
for SPDIs, this kernel for a particular cycle is a non-convex polygon and we
give a non-iterative algorithm for computing the coordinates of its vertices and
edges¹. Clearly, the invariance kernel provides useful insight about the behavior
of the SPDI around simple cycles. Furthermore, we present an alternative algorithm
to the one presented in [ASY01] for solving the reachability problem for
SPDIs. This algorithm is a breadth-first search, in the spirit of traditional model
checking algorithms. Invariance kernels play a key role in the algorithm.



2 Preliminaries 

2.1 Truncated Affine Multivalued Functions

A (positive) affine function $f : \mathbb{R} \to \mathbb{R}$ is such that $f(x) = ax + b$ with $a > 0$.
An affine multivalued function $F : \mathbb{R} \to 2^{\mathbb{R}}$, denoted $F = \langle f_l, f_u \rangle$, is defined by
$F(x) = \langle f_l(x), f_u(x) \rangle$ where $f_l$ and $f_u$ are affine and $\langle \cdot, \cdot \rangle$ denotes an interval. In
what follows we will consider only well-formed intervals, i.e. $\langle l, u \rangle$ is an interval
iff $l < u$. For notational convenience, we do not make explicit whether intervals
are open, closed, left-open or right-open, unless required for comprehension.
For an interval $I = \langle l, u \rangle$ we have that $F(\langle l, u \rangle) = \langle f_l(l), f_u(u) \rangle$. The inverse
of $F$ is defined by $F^{-1}(x) = \{y \mid x \in F(y)\}$. It is not difficult to show that
$F^{-1} = \langle f_u^{-1}, f_l^{-1} \rangle$. The universal inverse of $F$ is defined by $\tilde{F}^{-1}(I) = I'$ iff $I'$ is
the greatest non-empty interval such that for all $x \in I'$, $F(x) \subseteq I$. Notice that
if $I$ is a singleton then $\tilde{F}^{-1}$ is defined only if $f_l = f_u$. These classes of functions
are closed under composition.

¹ Notice that since SPDIs are partially defined over the plane, their invariance kernels
are in general different from the whole plane.

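A truncated affine multivalued function (TAMF) $\mathcal{F} : \mathbb{R} \to 2^{\mathbb{R}}$ is defined by an
affine multivalued function $F$ and intervals $S \subseteq \mathbb{R}^+$ and $J \subseteq \mathbb{R}^+$ as follows:
$\mathcal{F}(x) = F(x) \cap J$ if $x \in S$, otherwise $\mathcal{F}(x) = \emptyset$. For convenience we write
$\mathcal{F}(x) = F(\{x\} \cap S) \cap J$. For an interval $I$, $\mathcal{F}(I) = F(I \cap S) \cap J$ and $\mathcal{F}^{-1}(I) =
F^{-1}(I \cap J) \cap S$. We say that $\mathcal{F}$ is normalized if $S = Dom\,\mathcal{F} = \{x \mid F(x) \cap J \neq \emptyset\}$
(thus, $S \subseteq F^{-1}(J)$) and $J = Im\,\mathcal{F} = \mathcal{F}(S)$. In what follows we only consider
normalized TAMFs. The universal inverse of $\mathcal{F}$ is defined by $\tilde{\mathcal{F}}^{-1}(I) = I'$ iff
$I'$ is the greatest non-empty interval such that for all $x \in I'$, $F(x) \subseteq I$ and
$F(x) = \mathcal{F}(x)$.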

TAMFs are closed under composition [ASY01]:

Theorem 1. The composition of two TAMFs $\mathcal{F}_1(I) = F_1(I \cap S_1) \cap J_1$ and
$\mathcal{F}_2(I) = F_2(I \cap S_2) \cap J_2$ is the TAMF $(\mathcal{F}_2 \circ \mathcal{F}_1)(I) = \mathcal{F}(I) = F(I \cap S) \cap J$, where
$F = F_2 \circ F_1$, $S = S_1 \cap F_1^{-1}(J_1 \cap S_2)$ and $J = J_2 \cap F_2(J_1 \cap S_2)$.
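
The definitions of this section and the composition rule of Theorem 1 can be sketched in Python with exact rational arithmetic. This is an illustrative sketch, not the SPeeDI implementation: the class names `Affine` and `Tamf`, the helper `meet`, and the treatment of all intervals as closed are assumptions made for the example, and the eight edge-to-edge maps are illustrative values.

```python
from dataclasses import dataclass
from fractions import Fraction
from functools import reduce
from typing import Optional, Tuple

Interval = Optional[Tuple[Fraction, Fraction]]  # None stands for the empty set


def meet(i: Interval, j: Interval) -> Interval:
    """Intersection of two intervals (treated as closed)."""
    if i is None or j is None:
        return None
    lo, hi = max(i[0], j[0]), min(i[1], j[1])
    return (lo, hi) if lo <= hi else None


@dataclass(frozen=True)
class Affine:
    a: Fraction  # slope, assumed > 0
    b: Fraction

    def __call__(self, x):
        return self.a * x + self.b

    def inverse(self) -> "Affine":
        return Affine(Fraction(1) / self.a, -self.b / self.a)

    def after(self, g: "Affine") -> "Affine":
        """The composition self o g."""
        return Affine(self.a * g.a, self.a * g.b + self.b)


@dataclass(frozen=True)
class Tamf:
    fl: Affine
    fu: Affine
    S: Interval
    J: Interval

    def image(self, i: Interval) -> Interval:      # F(<l,u>) = <fl(l), fu(u)>
        return None if i is None else (self.fl(i[0]), self.fu(i[1]))

    def preimage(self, i: Interval) -> Interval:   # F^-1 = <fu^-1, fl^-1>
        if i is None:
            return None
        return (self.fu.inverse()(i[0]), self.fl.inverse()(i[1]))

    def __call__(self, i: Interval) -> Interval:   # F(I ∩ S) ∩ J
        return meet(self.image(meet(i, self.S)), self.J)

    def normalized(self) -> "Tamf":
        s = meet(self.S, self.preimage(self.J))
        return Tamf(self.fl, self.fu, s, meet(self.image(s), self.J))


def compose(f2: Tamf, f1: Tamf) -> Tamf:
    """Theorem 1: F = F2 o F1, S = S1 ∩ F1^-1(J1 ∩ S2), J = J2 ∩ F2(J1 ∩ S2)."""
    mid = meet(f1.J, f2.S)
    return Tamf(f2.fl.after(f1.fl), f2.fu.after(f1.fu),
                meet(f1.S, f1.preimage(mid)), meet(f2.J, f2.image(mid)))


UNIT = (Fraction(0), Fraction(1))


def tamf(al, bl, au, bu) -> Tamf:
    return Tamf(Affine(Fraction(al), Fraction(bl)),
                Affine(Fraction(au), Fraction(bu)), UNIT, UNIT)


# Eight illustrative edge-to-edge maps around a cycle.
STEPS = ([tamf(Fraction(1, 2), 0, Fraction(1, 2), 0),
          tamf(1, Fraction(-1, 10), 1, Fraction(11, 60))]
         + [tamf(1, 0, 1, 0)] * 5
         + [tamf(1, Fraction(1, 5), 1, Fraction(1, 5))])
CYCLE = reduce(lambda acc, nxt: compose(nxt, acc), STEPS).normalized()
```

Applying `CYCLE` to an interval gives the same result as applying the eight maps one after the other, which is the practical content of the closure property: a whole cycle can be summarised by a single TAMF.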



2.2 SPDI 

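An angle $\angle_{\mathbf{a}}^{\mathbf{b}}$ on the plane, defined by two non-zero vectors $\mathbf{a}, \mathbf{b}$, is the set of all
positive linear combinations $\mathbf{x} = \alpha\,\mathbf{a} + \beta\,\mathbf{b}$, with $\alpha, \beta \geq 0$ and $\alpha + \beta > 0$. We
can always assume that $\mathbf{b}$ is situated in the counter-clockwise direction from $\mathbf{a}$.
A polygonal differential inclusion system (SPDI) is defined by giving a finite
partition $\mathbb{P}$ of the plane into convex polygonal sets, and associating with each
$P \in \mathbb{P}$ a couple of vectors $\mathbf{a}_P$ and $\mathbf{b}_P$. Let $\phi(P) = \angle_{\mathbf{a}_P}^{\mathbf{b}_P}$. The SPDI is $\dot{\mathbf{x}} \in \phi(P)$
for $\mathbf{x} \in P$.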

Let $E(P)$ be the set of edges of $P$. We say that $e \in E(P)$ is an entry of $P$ if for
all $\mathbf{x} \in e$ and for all $\mathbf{c} \in \phi(P)$, $\mathbf{x} + \mathbf{c}\epsilon \in P$ for some $\epsilon > 0$. We say that $e$ is an exit
of $P$ if the same condition holds for some $\epsilon < 0$. We denote by $In(P) \subseteq E(P)$
the set of all entries of $P$ and by $Out(P) \subseteq E(P)$ the set of all exits of $P$.

Assumption 1 All the edges in $E(P)$ are either entries or exits, that is, $E(P) =
In(P) \cup Out(P)$.
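
For a convex region, the entry/exit condition can be tested with dot products against the inward normal of the edge. The Python sketch below is an assumed simplification, not a procedure taken from [ASY01]: the function name `classify_edge` and the handling of tangential dynamics (reported as "neither", i.e. a violation of Assumption 1) are choices made for the illustration.

```python
def classify_edge(inward_normal, a, b):
    """Classify an edge of a convex region whose dynamics are the cone
    spanned by vectors a and b. Every direction in the cone points into
    the region iff both a and b have a positive component along the
    inward normal of the edge; symmetrically for exits."""
    def dot(u, v):
        return u[0] * v[0] + u[1] * v[1]

    da, db = dot(inward_normal, a), dot(inward_normal, b)
    if da > 0 and db > 0:
        return "entry"
    if da < 0 and db < 0:
        return "exit"
    return "neither"  # tangential or mixed: Assumption 1 would be violated
```

For a unit square with dynamics a = b = (1, 0), the left edge (inward normal (1, 0)) is an entry, the right edge (inward normal (-1, 0)) is an exit, and the horizontal edges are neither.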



Example 1. Consider the SPDI illustrated in Fig. 1(a). For each region $R_i$, $1 \leq
i \leq 8$, there is a pair of vectors $(\mathbf{a}_i, \mathbf{b}_i)$, where: $\mathbf{a}_1 = \mathbf{b}_1 = (1, 5)$, $\mathbf{a}_2 = \mathbf{b}_2 =
(-1, \frac{1}{2})$, $\mathbf{a}_3 = (-1, \frac{11}{60})$ and $\mathbf{b}_3 = (-1, -\frac{1}{4})$, $\mathbf{a}_4 = \mathbf{b}_4 = (-1, -1)$, $\mathbf{a}_5 = \mathbf{b}_5 =
(0, -1)$, $\mathbf{a}_6 = \mathbf{b}_6 = (1, -1)$, $\mathbf{a}_7 = \mathbf{b}_7 = (1, 0)$, $\mathbf{a}_8 = \mathbf{b}_8 = (1, 1)$. ■

A trajectory segment of an SPDI is a continuous function $\xi : [0, T] \to \mathbb{R}^2$ which
is smooth everywhere except in a discrete set of points, and such that for all
$t \in [0, T]$, if $\xi(t) \in P$ and $\dot{\xi}(t)$ is defined then $\dot{\xi}(t) \in \phi(P)$. The signature, denoted
$Sig(\xi)$, is the ordered sequence of edges traversed by the trajectory segment, that
is, $e_1, e_2, \ldots$, where $\xi(t_i) \in e_i$ and $t_i < t_{i+1}$. If $T = \infty$, a trajectory segment is
called a trajectory.

Assumption 2 We will only consider trajectories with infinite signatures. 



2.3 Successors and Predecessors 



Given an SPDI, we fix a one-dimensional coordinate system on each edge to
represent points lying on edges [ASY01]. For notational convenience, we will use
$e$ to denote both the edge and its one-dimensional representation. Accordingly,
we write $\mathbf{x} \in e$ or $x \in e$, to mean “point $\mathbf{x}$ in edge $e$ with coordinate $x$ in the
one-dimensional coordinate system of $e$”. The same convention is applied to sets
of points of $e$ represented as intervals (e.g., $\mathbf{x} \in I$ or $x \in I$, where $I \subseteq e$) and to
trajectories (e.g., “$\xi$ starting in $\mathbf{x}$” or “$\xi$ starting in $x$”).

Now, let $P \in \mathbb{P}$, $e \in In(P)$ and $e' \in Out(P)$. For $I \subseteq e$, $Succ_{ee'}(I)$ is the set of all
points in $e'$ reachable from some point in $I$ by a trajectory segment $\xi : [0, t] \to \mathbb{R}^2$
in $P$ (i.e., $\xi(0) \in I \wedge \xi(t) \in e' \wedge Sig(\xi) = ee'$). We have shown in [ASY01] that
$Succ_{ee'}$ is a TAMF².



Example 2. Let $e_1, \ldots, e_8$ be as in Fig. 1(a) and $I = [l, u]$. We assume a
one-dimensional coordinate system such that $e_i = S_i = J_i = (0, 1)$. We have that:

$F_{e_1 e_2}(I) = \left[\frac{l}{2}, \frac{u}{2}\right]$,  $F_{e_2 e_3}(I) = \left[l - \frac{1}{10},\ u + \frac{11}{60}\right]$,
$F_{e_i e_{i+1}}(I) = I$ for $3 \leq i \leq 7$,  $F_{e_8 e_1}(I) = \left[l + \frac{1}{5},\ u + \frac{1}{5}\right]$,

with $Succ_{e_i e_{i+1}}(I) = F_{e_i e_{i+1}}(I \cap S_i) \cap J_{i+1}$, for $1 \leq i \leq 7$, and
$Succ_{e_8 e_1}(I) = F_{e_8 e_1}(I \cap S_8) \cap J_1$. ■

Given a sequence $w = e_1, e_2, \ldots, e_n$, Theorem 1 implies that the successor of $I$
along $w$, defined as $Succ_w(I) = Succ_{e_{n-1} e_n} \circ \ldots \circ Succ_{e_1 e_2}(I)$, is a TAMF.



Example 3. Let $\sigma = e_1 \cdots e_8 e_1$. We have that $Succ_\sigma(I) = F(I \cap S) \cap J$, where:

$F(I) = \left[\frac{l}{2} + \frac{1}{10},\ \frac{u}{2} + \frac{23}{60}\right]$,

$S = (0, 1)$ and $J = (\frac{1}{5}, \frac{53}{60})$ are computed using Theorem 1. ■

For $I \subseteq e'$, $Pre_{ee'}(I)$ is the set of points in $e$ that can reach a point in $I$ by a
trajectory segment in $P$. We have that [ASY01]: $Pre_{ee'} = Succ_{ee'}^{-1}$ and $Pre_\sigma =
Succ_\sigma^{-1}$.



Example 4. Let $\sigma = e_1 \ldots e_8 e_1$ be as in Fig. 1(a) and $I = [l, u]$. We have that
$Pre_{e_i e_{i+1}}(I) = F^{-1}_{e_i e_{i+1}}(I \cap J_{i+1}) \cap S_i$, for $1 \leq i \leq 7$, and $Pre_{e_8 e_1}(I) = F^{-1}_{e_8 e_1}(I \cap
J_1) \cap S_8$, where:

$F^{-1}_{e_1 e_2}(I) = [2l, 2u]$,  $F^{-1}_{e_2 e_3}(I) = \left[l - \frac{11}{60},\ u + \frac{1}{10}\right]$,
$F^{-1}_{e_i e_{i+1}}(I) = I$ for $3 \leq i \leq 7$,  $F^{-1}_{e_8 e_1}(I) = \left[l - \frac{1}{5},\ u - \frac{1}{5}\right]$.

Besides, $Pre_\sigma(I) = F^{-1}(I \cap J) \cap S$, where $F^{-1}(I) = \left[2l - \frac{23}{30},\ 2u - \frac{1}{5}\right]$. ■

² In [ASY01] we explain how to choose the positive direction on every edge in order
to guarantee positive coefficients in the TAMF.

2.4 Qualitative Analysis of Simple Edge-Cycles

Let $\sigma = e_1 \cdots e_k e_1$ be a simple edge-cycle, i.e., $e_i \neq e_j$ for all $1 \leq i \neq j \leq k$. Let
$Succ_\sigma(I) = F(I \cap S) \cap J$ with $F = \langle f_l, f_u \rangle$ (we suppose that this representation
is normalized). We denote by $\mathcal{D}_\sigma$ the one-dimensional discrete-time dynamical
system defined by $Succ_\sigma$, that is $x_{n+1} \in Succ_\sigma(x_n)$.

Assumption 3 Neither of the two functions $f_l$, $f_u$ is the identity.

Let $l^*$ and $u^*$ be the fixpoints³ of $f_l$ and $f_u$, respectively, and $S \cap J = \langle L, U \rangle$.
We have shown in [ASY01] that a simple cycle is of one of the following types:

STAY. The cycle is abandoned neither by the leftmost nor by the rightmost
trajectory, that is, $L \leq l^* \leq u^* \leq U$.

DIE. The rightmost trajectory exits the cycle through the left (consequently the
leftmost one also exits) or the leftmost trajectory exits the cycle through the
right (consequently the rightmost one also exits), that is, $u^* < L \vee l^* > U$.

EXIT-BOTH. The leftmost trajectory exits the cycle through the left and the
rightmost one through the right, that is, $l^* < L \wedge u^* > U$.

EXIT-LEFT. The leftmost trajectory exits the cycle (through the left) but the
rightmost one stays inside, that is, $l^* < L \leq u^* \leq U$.

EXIT-RIGHT. The rightmost trajectory exits the cycle (through the right)
but the leftmost one stays inside, that is, $L \leq l^* \leq U < u^*$.
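
The classification is a direct comparison of the two fixpoints with the bounds $L$ and $U$. The following Python sketch illustrates it under two assumptions: the boundary cases are read as non-strict inequalities, and only finite fixpoints are handled (a slope of exactly 1 is rejected). The function names are hypothetical.

```python
from fractions import Fraction


def fixpoint(a, b):
    """Fixpoint of f(x) = a*x + b, i.e. the solution of a*x + b = x.
    Only the finite case (a != 1) is covered in this sketch."""
    a, b = Fraction(a), Fraction(b)
    if a == 1:
        raise ValueError("slope 1: no finite fixpoint")
    return b / (1 - a)


def classify_cycle(l_star, u_star, L, U):
    """Return the type of a simple cycle from the fixpoints l*, u* of
    f_l, f_u and the bounds of S ∩ J = <L, U>."""
    if u_star < L or l_star > U:
        return "DIE"
    if L <= l_star <= u_star <= U:
        return "STAY"
    if l_star < L and u_star > U:
        return "EXIT-BOTH"
    if l_star < L <= u_star <= U:
        return "EXIT-LEFT"
    if L <= l_star <= U < u_star:
        return "EXIT-RIGHT"
    raise ValueError("inconsistent input (expected l* <= u* and L <= U)")
```

For instance, with $f_l(x) = x/2 + 1/10$ and $f_u(x) = x/2 + 23/60$ the fixpoints are $1/5$ and $23/30$; with $\langle L, U \rangle = \langle 1/5, 53/60 \rangle$ the function returns "STAY".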

Example 5. Let $\sigma = e_1 \cdots e_8 e_1$. We have that $S \cap J = \langle L, U \rangle = \langle \frac{1}{5}, \frac{53}{60} \rangle$. The
fixpoints of the equation in Example 3 are such that $L = l^* = \frac{1}{5} < u^* = \frac{23}{30} < U$.
Thus, $\sigma$ is STAY. ■

The classification above gives us some information about the qualitative behav- 
ior of trajectories. Any trajectory that enters a cycle of type DIE will eventually 
quit it after a finite number of turns. If the cycle is of type STAY, all trajectories 
that happen to enter it will keep turning inside it forever. In all other cases, some 
trajectories will turn for a while and then exit, and others will continue turning
forever. This information is very useful for solving the reachability problem
[ASY01].

Example 6. Consider again the cycle $\sigma = e_1 \cdots e_8 e_1$. Fig. 1(b) shows the reach
set of the interval $[0.95, 1.0] \subset e_1$. Notice that the leftmost trajectory “converges
to” the limit $l^* = \frac{1}{5}$. Fig. 1(b) has been automatically generated by the SPeeDI
toolbox [APSY02] we have developed for reachability analysis of SPDIs. ■

The above result does not allow us to directly answer other questions about
the behavior of the SPDI, such as determining for a given point (or set of points)
whether any trajectory (if it exists) starting in the point remains in the cycle
forever. In order to do this, we need to further study the properties of the system
around simple edge-cycles and in particular STAY cycles. See [Sch03] for some
important properties of STAY cycles.

³ Obviously, the fixpoint $x^*$ is computed by solving a linear equation $f(x^*) = x^*$,
which can be finite or infinite (see Lemma 6, page 45 of [Sch02]).

3 Invariance Kernel

In this section we define the notion of invariance kernel and we show how to 
compute it. In general, an invariant set is a set of points such that for any point 
in the set, every trajectory starting in such point remains in the set forever and 
the invariance kernel is the largest of such sets. 

In particular, for SPDI, given a cyclic signature, an invariant set is a set of 
points which keep rotating in the cycle forever and the invariance kernel is the 
largest of such sets. We show that this kernel is a non-convex polygon (often 
with a hole in the middle) and we give a non-iterative algorithm for computing 
the coordinates of its vertices and edges. 

In what follows, let $K \subset \mathbb{R}^2$. We recall the definition of viable trajectory. A
trajectory $\xi$ is viable in $K$ if $\xi(t) \in K$ for all $t \geq 0$. $K$ is a viability domain if
for every $\mathbf{x} \in K$, there exists at least one trajectory $\xi$ with $\xi(0) = \mathbf{x}$, which is
viable in $K$.

Definition 1. We say that a set $K$ is invariant if for any $\mathbf{x} \in K$ such that there
exists at least one trajectory starting in it, every trajectory starting in $\mathbf{x}$ is viable
in $K$. Given a set $K$, its largest invariant subset is called the invariance kernel
of $K$ and is denoted by $Inv(K)$. ■

We denote by $\mathcal{D}_\sigma$ the one-dimensional discrete-time dynamical system defined
by $Succ_\sigma$, that is $x_{n+1} \in Succ_\sigma(x_n)$. The concepts above can be defined for $\mathcal{D}_\sigma$,
by setting that a trajectory $x_0 x_1 \ldots$ of $\mathcal{D}_\sigma$ is viable in an interval $I \subseteq \mathbb{R}$ if $x_i \in I$
for all $i \geq 0$. Similarly we say that an interval $I$ in an edge $e$ is invariant if any
trajectory starting on $x_0 \in I$ is viable in $I$.

Before showing how to compute the invariance kernel of a cycle, we give a
characterization of one-dimensional discrete-time invariants.

Lemma 1. For $\mathcal{D}_\sigma$ and $\sigma$ a STAY cycle, the following is valid. If $I$ is such
that $F(I) \subseteq I$ and $F(I) = \mathcal{F}(I)$ then $I$ is invariant. On the other hand if $I$ is
invariant then $F(I) = \mathcal{F}(I)$.

Proof: Suppose that $F(I) = \mathcal{F}(I)$ and $F(I) \subseteq I$, then $\mathcal{F}(I) \subseteq I$, thus by
definition of STAY and monotonicity of $\mathcal{F}$, we know that for all $n$, $\mathcal{F}^n(I) \subseteq I$.
Hence $I$ is invariant. Let us suppose now that $I$ is invariant, then for any trajectory
starting on $x_0 \in I$, $x_0 x_1 \ldots$ is in $I$ and trivially $F(I) = \mathcal{F}(I)$. □

Given two edges $e$ and $e'$ and an interval $I \subseteq e'$ we define the ∀-predecessor
$\widetilde{Pre}(I)$ in a similar way to $Pre(I)$ using the universal inverse instead of just the
inverse: for $I \subseteq e'$, $\widetilde{Pre}_{ee'}(I)$ is the set of points in $e$ such that any successor of
such points is in $I$ by a trajectory segment in $P$. We have that $\widetilde{Pre}_{ee'} = \widetilde{Succ}^{-1}_{ee'}$
and $\widetilde{Pre}_\sigma = \widetilde{Succ}^{-1}_\sigma$.

Theorem 2. For $\mathcal{D}_\sigma$, if $\sigma = e_1 \ldots e_n e_1$ is STAY then $Inv(e_1) = \widetilde{Pre}_\sigma(J)$, else
$Inv(e_1) = \emptyset$.

Proof: That $Inv(e_1) = \emptyset$ for any type of cycle but STAY follows directly from
the definition of each type of cycle.

Let us consider a STAY cycle with signature $\sigma$. Let $I_K = \tilde{\mathcal{F}}^{-1}(J) = \widetilde{Pre}_\sigma(J)$.
We know that $F(\tilde{\mathcal{F}}^{-1}(J)) = \mathcal{F}(\tilde{\mathcal{F}}^{-1}(J))$⁴ and by the STAY property, $F(I_K) \subseteq I_K$,
thus by Lemma 1 we have that $I_K$ is invariant. We prove now that $I_K$ is
indeed the greatest invariant. Let us suppose that there exists an invariant $H \subseteq S$
strictly greater than $I_K$. By assumption we have that $I_K = \tilde{\mathcal{F}}^{-1}(J) \subset H$, then
by monotonicity of $F$, $F(\tilde{\mathcal{F}}^{-1}(J)) \subseteq F(H)$ and since $F(\tilde{\mathcal{F}}^{-1}(J)) = J$⁵ we have
that $J \subset F(H)$, but this contradicts the monotonicity of $F$ since $J = F(S) \subset
F(H)$ and then $S \subset H$, which contradicts the hypothesis that $H \subseteq S$. Hence,
$Inv(e_1) = \widetilde{Pre}_\sigma(J)$. □
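
On a single edge, the ∀-predecessor of an interval has a closed form: a point $x$ qualifies when the whole interval $\langle f_l(x), f_u(x) \rangle$ fits inside the target. The Python sketch below illustrates this computation under assumptions: affine functions are given as `(a, b)` pairs with $a > 0$, intervals are treated as closed, the function names are hypothetical, and the caller is expected to have already established that the cycle is STAY.

```python
from fractions import Fraction


def universal_preimage(fl, fu, target):
    """Largest interval of points x with [fl(x), fu(x)] inside `target`.
    fl and fu are (a, b) pairs standing for x -> a*x + b with a > 0.
    Returns None when no such point exists."""
    (al, bl), (au, bu) = fl, fu
    lo = (target[0] - bl) / al   # fl(x) >= target lower bound
    hi = (target[1] - bu) / au   # fu(x) <= target upper bound
    return (lo, hi) if lo <= hi else None


def kernel_on_edge(fl, fu, S, J):
    """Theorem 2 for a STAY cycle: the universal predecessor of J,
    restricted to the domain S of the cycle's TAMF."""
    pre = universal_preimage(fl, fu, J)
    if pre is None:
        return None
    lo, hi = max(pre[0], S[0]), min(pre[1], S[1])
    return (lo, hi) if lo <= hi else None
```

With $f_l(x) = x/2 + 1/10$, $f_u(x) = x/2 + 23/60$, $S = (0, 1)$ and $J = (1/5, 53/60)$, the result is the interval from $1/5$ to $1$; its image under $F$ is again contained in it, as invariance requires.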

The invariance kernel for the continuous-time system can now be found by
propagating $\widetilde{Pre}(J)$ from $e_1$ using the following operator. The extended ∀-predecessor
of an output edge $e$ of a region $R$ is the set of points in $R$ such that every trajectory
segment starting in such a point reaches $e$ without traversing any other edge.
More formally:

Definition 2. Let $R$ be a region and $e$ be an edge in $Out(R)$. The $e$-extended
∀-predecessor of $I$, $\overline{Pre}_e(I)$, is defined as:

$\overline{Pre}_e(I) = \{\mathbf{x} \mid \forall \xi .\ (\xi(0) = \mathbf{x} \Rightarrow \exists t \geq 0 .\ (\xi(t) \in I \wedge Sig(\xi[0, t]) = e))\}$. ■

The above notion can be extended to cyclic signatures (and so to edge-signatures)
as follows. Let $\sigma = e_1, \ldots, e_k$ be a cyclic signature. For $I \subseteq e_1$, the $\sigma$-extended
∀-predecessor of $I$, $\overline{Pre}_\sigma(I)$, is the set of all $\mathbf{x} \in \mathbb{R}^2$ for which any trajectory
segment $\xi$ starting in $\mathbf{x}$ reaches some point in $I$, such that $Sig(\xi)$ is a suffix of
$e_2 \ldots e_k e_1$.

It is easy to see that $\overline{Pre}_\sigma(I)$ is a polygonal subset of the plane which can be
calculated using the following procedure. First compute $\overline{Pre}_{e_i}(I)$ for all $1 \leq i \leq n$
and then apply this operation $k$ times: $\overline{Pre}_\sigma(I) = \bigcup_{i=1}^{k} \overline{Pre}_{e_i}(I_i)$, with $I_1 = I$,
$I_k = \widetilde{Pre}_{e_k e_1}(I_1)$ and $I_i = \widetilde{Pre}_{e_i e_{i+1}}(I_{i+1})$ for $2 \leq i \leq k - 1$.

Now, let us define the following set:

$K_\sigma = \bigcup_{i=1}^{k} (int(P_i) \cup e_i)$

where $P_i$ is such that $e_{i-1} \in In(P_i)$, $e_i \in Out(P_i)$ and $int(P_i)$ is the interior of
$P_i$.

We can now compute the invariance kernel of $K_\sigma$.

Theorem 3. If $\sigma$ is STAY then $Inv(K_\sigma) = \overline{Pre}_\sigma(\widetilde{Pre}_\sigma(J))$, otherwise $Inv(K_\sigma) =
\emptyset$.

Proof: Trivially $Inv(K_\sigma) = \emptyset$ for any type of cycle but STAY. That $Inv(K_\sigma) =
\overline{Pre}_\sigma(\widetilde{Pre}_\sigma(J))$ for STAY cycles follows directly from Theorem 2 and the definition
of $\overline{Pre}$. □

Example 7. Let $\sigma = e_1 \ldots e_8 e_1$. Fig. 2 depicts: (a) $K_\sigma$, and (b) $\overline{Pre}_\sigma(\widetilde{Pre}_\sigma(J))$. ■

⁴ See Lemma 13 in [Sch03] for a proof.

⁵ See Lemma 12 in [Sch03] for a proof.

Fig. 2. Invariance kernel.



4 Reachability Algorithms for SPDIs 

4.1 Previous Algorithm [ASY01]

The decidability proof of [ASY01] already provides an algorithmic way of deciding
reachability in SPDIs, which was implemented in our tool SPeeDI [APSY02].
We will give an overview of the algorithm to be able to compare and contrast 
it with the new algorithm that we are proposing. The decidability proof is split 
into three steps: 

1. Identify a notion of types of signatures, each of which ‘embodies’ a number 
of signatures through the SPDI. 

2. Prove that a finite number of types suffice to cover all edge signatures. Fur- 
thermore, given an SPDI, this set is computable. 

3. Give an algorithm which decides whether a given type includes a signature 
which is feasible under the differential inclusion constraints of the SPDI. 

We will not go into the details (see [ASY01]), but will outline a
number of items which will allow us to compare the algorithms.

Definition 3. A type signature is a sequence of edge signatures with alternating
loops: $r_1 s_1^+ r_2 s_2^+ \ldots s_n^+ r_{n+1} s^*$. The $r_i$ parts of the type signature are called the
sequential paths while the $s_i$ parts are called iteration paths. The last iteration path
$s$ is always a STAY loop. The interpretation of a type is similar to that of regular
expressions:

$signatures(r_1 s_1^+ r_2 s_2^+ \ldots s_n^+ r_{n+1} s^*) = \{r_1 s_1^{k_1} r_2 s_2^{k_2} \ldots s_n^{k_n} r_{n+1} s^k \mid k_i > 0,\ k \geq 0\}$ ■

In [ASY01], one can find details of how to decide whether a given type signature
includes an edge signature which is feasible. Clearly, given a source edge $e_s$
and a destination edge $e_f$, there potentially exists an infinite number of type
signatures from $e_s$ to $e_f$. To reduce this to a finite number, [ASY01] applies a
number of syntactic constraints which ensure finiteness, but do not leave out any
possibly feasible edge signatures. Using these constraints it is easy to implement
a depth-first traversal of the SPDI to check all possible type signatures. Note
that a breadth-first traversal would require excessive storage for all
intermediate nodes.

Fig. 3. The edge-graph of the swimmer SPDI example



From our experience in using SPeeDI, our implementation of this algorithm, 
the main deficiency of this approach is that incorrect systems which may have 
‘short’ counter-examples (in terms of type signature length) end up lost in the 
exploration of long paths — either taking an excessive amount of time to find 
the counter-example, or coming up with a long counter-example difficult to use 
for debugging the hybrid system. Ideally, we should be able to find a shortest 
counter-example without the need of exhaustive exploration of the SPDI. 

4.2 A Breadth-First Algorithm 

As is evident from the previous section, it is desirable to have a breadth-first
algorithm to be able to identify shortest⁶ counter-examples and be able to use
standard algorithms for optimisation.

Definition 4. The edge graph of an SPDI with partition $\mathbb{P}$ is the graph with
the region edges as nodes: $N = \bigcup_{P \in \mathbb{P}} E(P)$; and transitions between two edges
in the same partition with the first being an input and the second an output edge:
$T = \{(e, e') \mid \exists P \in \mathbb{P} .\ e, e' \in P,\ e \in In(P),\ e' \in Out(P)\}$. ■
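
Definition 4 translates almost literally into code. The Python sketch below assumes a hypothetical input format in which each region is described only by its sets of entry and exit edge names; the geometry of the regions is not needed for the graph itself.

```python
def edge_graph(regions):
    """Build the edge graph of Definition 4.

    `regions` maps a region name to a pair (entries, exits) of edge names.
    Returns (nodes, transitions), where a transition (e, e2) links an entry
    edge e to an exit edge e2 of the same region."""
    nodes, transitions = set(), set()
    for entries, exits in regions.values():
        nodes.update(entries)
        nodes.update(exits)
        transitions.update((e, e2) for e in entries for e2 in exits)
    return nodes, transitions


# Hypothetical ring of three regions: P_i is entered via e_{i-1}, left via e_i.
RING = {
    "P1": ({"e3"}, {"e1"}),
    "P2": ({"e1"}, {"e2"}),
    "P3": ({"e2"}, {"e3"}),
}
```

For `RING`, the graph has the three nodes e1, e2, e3 and the three transitions (e3, e1), (e1, e2), (e2, e3), forming a single cycle.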

Example 8. To illustrate the notion of an edge-graph, Fig. 3 shows the edge-graph
corresponding to the SPDI representing the swimmer example given in
Fig. 1(a).

Definition 5. The meta-graph of an SPDI $S$ is its edge graph augmented with
the loops in the SPDI:

1. An unlabelled transition for every transition in the original graph: $\{e \rightarrow
e' \mid$ input edge $e$ and output edge $e'$ belong to the same region$\}$

2. A set of labelled transitions, one for each simple loop in the original graph
which is eventually left: $\{e \stackrel{se}{\leadsto} e' \mid head(s) \neq e',\ esee'$ is a valid path in $S\}$

3. A set of labelled sink edges, one for each simple loop of type STAY which is
never left: $\{e \stackrel{se}{\circlearrowleft} \mid ese$ is a valid path in $S$, $se$ is a STAY loop$\}$. ■

Note that "shortest", in this context, is not in terms of the length of the path on
the SPDI, or the number of edges visited, but in terms of the length of the abstract
signature which includes a counter-example.
Note that reachability along a path through the meta-graph corresponds to that
of a type signature as defined in the previous section. For example,
$e_1 \to e_2 \stackrel{s_1}{\leadsto} e_3 \stackrel{s_2}{\circlearrowleft}$ would correspond to $e_1 e_2 s_1^{+} e_3 s_2^{*}$.
From the result in [ASY01], which states
that only a finite number of abstract signatures (of finite length) suffices to
describe the set of all signatures through the graph, it immediately follows that
for any SPDI, it suffices to explore the meta-graph to a finite depth to deduce
the reachable set.

Proposition 1 . Given an SPDI S, reachability in S is equivalent to reachability 
in the meta-graph of S. □ 

To implement the meta-graph traversal, we will define the functions correspond- 
ing to the different transitions which, given a set of edge-intervals already visited, 
return a new set of edge-intervals which will be visited along that transition: 

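$\to(E) = \{Succ_{ee'}(i) \mid i \in E,\ i \subseteq e,\ e \to e'\}$

$\leadsto(E) = \{Succ_{p}(i) \mid i \in E,\ i \subseteq e,\ e \stackrel{\sigma}{\leadsto} e',\ p \text{ prefix of } e\sigma^{+}e'\}$

$\circlearrowleft(E) = \{Succ_{p}(i) \cap Inv(K_\sigma) \mid i \in E,\ i \subseteq e,\ e \stackrel{\sigma}{\circlearrowleft},\ p \text{ prefix of } e\sigma^{*}\}$.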

Note that, using the techniques developed in [ASY01], we can always calculate
the first two of the above. Furthermore, we are guaranteed that if $E$ consists of
a finite number of edge-intervals, so will $\to(E)$ and $\leadsto(E)$. Unfortunately, this
is not the case with $\circlearrowleft(E)$. However, it is possible to compute whether a given
set of edge-intervals and $\circlearrowleft(E)$ ($E$ being a finite set of edge-intervals) overlap.
If we consider the standard model checking approach, we can use a given SPDI
with transitions $\to$, meta-transitions $\leadsto$, sink-transitions $\circlearrowleft$ and initial set $I$:

$R_0 = I \qquad R_{n+1} = R_n \cup \to(R_n) \cup \leadsto(R_n) \cup \circlearrowleft(R_n)$

We can terminate once nothing else is added: $R_{n+1} = R_n$. Edge-interval sets
$R_n$ can be represented enumeratively. However, as already noted, STAY loops
represented by sinks may induce an infinite number of disjoint intervals. Since
sinks are dead-end transitions, we can simplify the reachability analysis by
performing the sinks only at the end:

$R_{n+1} = R_n \cup \to(R_n) \cup \leadsto(R_n)$

Since the termination condition depended on the fact that we were also applying
the sink transitions $\circlearrowleft$, we add this when we check the termination condition:
$R_n \cup \circlearrowleft(R_n) = R_{n+1} \cup \circlearrowleft(R_{n+1})$. Although the problem has simply been
moved to the termination test, we show that this condition can be reduced to
the simpler, and testable, $R_{n+1} \setminus R_n \subseteq Inv$, where $Inv$ is the union of all invariance
kernels of STAY loops. The proof of correctness of the algorithm can be found
in [Pac03].

We can now implement the algorithm in a similar manner as standard forward 
model checking: 

preR := {}; R := Src;
while (R \ preR ⊈ Inv)
    preR := R; R := R ∪ →(R) ∪ ⇝(R);
    if (Dst overlaps R) then return REACHABLE;
if (Dst overlaps (R ∪ ⟲(R)))
    then return REACHABLE else return UNREACHABLE;
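The loop above can be mirrored in Python as a minimal sketch. It is not the SPeeDI implementation: edge-intervals are abstracted to hashable tokens, "overlaps" is approximated by set intersection, and the successor functions `step`, `meta` and `sink` (standing for the three kinds of transition) are supplied by the caller. All names here are hypothetical.

```python
from typing import Callable, Hashable, Set

Intervals = Set[Hashable]
Succ = Callable[[Intervals], Intervals]


def reachable(src: Intervals, dst: Intervals,
              step: Succ, meta: Succ, sink: Succ, inv: Intervals) -> bool:
    """Breadth-first reachability with sinks applied only at the end."""
    pre_r: Intervals = set()
    r: Intervals = set(src)
    # Iterate while something new was added that lies outside Inv.
    while not (r - pre_r) <= inv:
        pre_r = set(r)
        r = r | step(r) | meta(r)
        if r & dst:
            return True
    # Sinks are dead ends, so they are only consulted once, after the loop.
    return bool(dst & (r | sink(r)))


def successors(edges):
    """Turn a set of (source, target) pairs into a successor function."""
    return lambda rs: {t for (s, t) in edges if s in rs}


# Toy instance: a -> b -> c, with a sink leading from c to k.
step = successors({("a", "b"), ("b", "c")})
meta = successors(set())
sink = successors({("c", "k")})
```

On finite sets the loop always stops, because an iteration that adds nothing leaves `r - pre_r` empty, which is trivially contained in `inv`.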
5 Conclusion

One of the contributions of this paper is an automatic procedure to obtain an 
important object of the phase portrait of simple planar differential inclusions 
(SPDIs [Sch02]), namely invariance kernels. 

We have also presented a breadth-first search algorithm for solving the reachability
problem for SPDIs. The advantage of such an algorithm is that it is much
simpler than the one presented in [ASY01] and it resembles the classical model
checking algorithm for computing reachability. Invariance kernels play a crucial
role in proving termination of the BFS reachability algorithm.

We intend to implement the algorithm in order to empirically compare it with 
the previous algorithm for SPDIs [APSY02]. 



Acknowledgments. We are thankful to Eugene Asarin and Sergio Yovine for 
Abstract. A reactive system does not terminate and its behaviors
are typically defined as a set of infinite sequences of states. In formal
verification, a requirement is usually expressed in a logic, and when
the models of the logic are also defined as infinite sequences, as is
the case for LTL, the satisfaction relation is simply defined by the
containment between the set of system behaviors and that of logic 
models. However, this satisfaction relation does not work for interval 
temporal logics, where the models can be considered as a set of finite 
sequences. In this paper, we observe that for different interval based 
properties, different satisfaction relations are sensible. Two classes of 
properties are discussed, and accordingly two satisfaction relations are 
defined, and they are subsequently unified by a more general definition. 
A tool is developed based on the Spin model checking system to verify 
the proposed general satisfaction relation for a decidable subset of 
Discrete Time Duration Calculus. 

Keywords: model checking, finitary property, reactive system, interval 
temporal logic 



1 Introduction 

A reactive system does not terminate and its semantics is typically defined as 
the set of infinite sequences of states generated by the execution of the system. 
In formal verification, a requirement is usually expressed in a logic. A popular 
specification logic for reactive systems is LTL, and it is not by chance that a 
model of an LTL formula is also an infinite sequence of states. In this setting, 
the satisfaction relation is simply the containment between the set of system 
behaviors and that of logic models. 
Another class of temporal logic is interval based. Although interval logics are
less widely used compared to LTL, there are properties that are easier to express 
in such logics [14]. In almost all the interval based logics, intervals are finite 
ones, corresponding to finite sequences of states, which makes set containment 
no longer an appropriate satisfaction relation when these logics are used as the 
specification logic. One possible definition for this satisfaction relation is that 
a reactive system satisfies such a property iff all the finite prefixes of all the 
infinite behaviors satisfy the property, i.e. all the prefixes must be models of the 
requirement formula. However, while this definition makes sense in some cases,
it does not in others, depending on the kind of property. Take a
mutual exclusion algorithm as an example. There are two well-known desirable
properties: no two processes can be in their critical sections at the same time, and a
process should be allowed to enter the critical section eventually. An attempt to
express these two properties in the Interval Temporal Logic (ITL) [10] or one of
its variants may result in the following two formulas:

- $\Box\neg(ain \wedge bin)$: a model of it is a finite interval such that $ain$ and $bin$ are
never true together in any of its sub-intervals;

- $\Diamond ain$: a model of it is a finite interval such that $ain$ is true in one state.

For $\Box\neg(ain \wedge bin)$, all the execution prefixes of a mutual exclusion algorithm
should be its models, and if all the execution prefixes are models of the formula,
intuitively the first requirement is satisfied. This is due to the fact that the
property is a safety property [9], for which the above mentioned satisfaction
relation is appropriate. On the other hand, for $\Diamond ain$, this satisfaction relation is
obviously not appropriate: clearly we cannot expect that, in every execution prefix
of a mutual exclusion algorithm, a particular process is in its critical section at some
point. The second property is a liveness property [9].

There are some efforts to develop logics over infinite intervals [11,16,19]. With 
such logics, it is possible to use set containment as the satisfaction relation. 
However, logics over infinite intervals have not been widely accepted, for
several reasons. Firstly, the length of an interval is an important concept in ITL, but
for an infinite interval this length is infinite, which many people are not
comfortable dealing with. Secondly, ITL can be decided by using finite
automata when it is over finite intervals [5,12], but obviously needs automata
over infinite words when the logic is defined over infinite intervals. Even worse,
effective automaton construction techniques for LTL do not seem to apply to
interval logics, as observed by Wolper [17].

In this paper, we propose to use the usual interval logic over finite intervals 
as the specification logic, and redefine the satisfaction relation. We identify two 
classes of properties, and accordingly define two satisfaction relations. The two 
relations are unified by a more general definition, and a method to automatically 
check whether a reactive system satisfies properties in interval logics according 
to this new definition is investigated. This turns out to be checking containment
between the set of (infinite) behaviors of the reactive system and the language
of a Büchi automaton which is the same as the finite automaton for the interval
logic formula except the final states of the finite automaton are now the accepting
states of the Büchi automaton.

As a particular application of our method, we have chosen a decidable subset 
of Discrete Time Duration Calculus [5] as the specification logic and developed a 
model checking tool, integrated with the Spin model checking system, to verify 
the proposed general satisfaction relation. 

The rest of the paper is organized as follows. In the next section, we give 
the syntax and the semantics of the logic, and the construction that converts a 
formula to a regular language representing the models of the formula. In Sec- 
tion 3, two kinds of properties, finitary safety property and eventual persistence 
property, are defined, followed by methods to recognize them. Satisfaction re- 
lations for finitary safety property and eventual persistence property are first 
studied separately and then unified in Section 4. A method to model check a 
reactive system with the proposed satisfaction relation is shown in Section 5. 
In Section 6, as a case study, Peterson’s mutual exclusion algorithm is model 
checked. The paper ends with a discussion. 

2 Quantified RDC 

Quantified RDC (QRDC) is an extension of the decidable subset of Duration Calculus
obtained by introducing quantifiers on state variables.

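We use the following notations throughout the paper. For a finite set $A$, $A^{+}$
denotes the set of all non-empty finite sequences over $A$ and $A^{*} = A^{+} \cup \{\varepsilon\}$,
where $\varepsilon$ is the unique empty sequence. We use $A^{\omega}$ to denote the set of infinite
sequences, i.e. $\omega$-sequences, over $A$, and $A^{\infty} = A^{*} \cup A^{\omega}$. For a sequence $\alpha$ over $A$,
$|\alpha|$ denotes the length of $\alpha$. In particular, $|\alpha| = \infty$ for $\alpha \in A^{\omega}$ and $|\varepsilon| = 0$. For
two sequences $\alpha_1 \in A^{*}$ and $\alpha_2 \in A^{\infty}$, $\alpha_1 \cdot \alpha_2$ denotes the sequence obtained
by concatenating $\alpha_2$ to the end of $\alpha_1$; $\alpha_1$ is called a prefix of $\alpha_2$, denoted as
$\alpha_1 \preceq \alpha_2$, iff $\exists \alpha_3 \in A^{\infty} : \alpha_1 \cdot \alpha_3 = \alpha_2$. For two sets of finite sequences $P_1$ and
$P_2$, $P_1 \cdot P_2 = \{\alpha_1 \cdot \alpha_2 \mid \alpha_1 \in P_1 \wedge \alpha_2 \in P_2\}$.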

2.1 Syntax 

We use $p, p_1, \ldots, p_n$ to denote state variables and $SVar$ the finite set of all the
state variables. State expressions and formulas are defined as follows:

$S ::= 0 \mid 1 \mid p \mid \neg S_1 \mid S_1 \vee S_2$

$\phi ::= \lceil\lceil S \rceil\rceil \mid \neg\phi \mid \phi_1 \vee \phi_2 \mid \phi_1 ; \phi_2 \mid \exists p.\, \phi$



2.2 Semantics 

The interpretation for state variables is given as a function

$I \in SVar \to (Time \to \{0, 1\})$

When we choose $Time$ to be $\mathbb{N}$, we obtain QRDC in discrete time (DQRDC).
The interpretation function $I$ is extended to state expression $S$ by induction on
the structure of $S$:

1. $I[\![0]\!](t) = 0$

2. $I[\![1]\!](t) = 1$

3. $I[\![p]\!](t) = I(p)(t)$

4. $I[\![\neg S]\!](t) = 1 - I[\![S]\!](t)$

5. $I[\![S_1 \vee S_2]\!](t) = \max(I[\![S_1]\!](t), I[\![S_2]\!](t))$

Given an interpretation $I$, the semantics of a QRDC formula $\phi$ is given by a
function

$I[\![\phi]\!] \in Intv \to \{tt, ff\}$,

where the set of time intervals is defined as

$Intv = \{[b, e] \mid b, e \in Time \wedge b \leq e\}$.

The function is defined as follows:

1. $I[\![\lceil\lceil S \rceil\rceil]\!]([b, e]) = tt$ iff $b < e$ and $\int_b^e I[\![S]\!](t)\,dt = e - b$ for dense time, or $b < e$
and $\sum_{t=b}^{e-1} I[\![S]\!](t) = e - b$ for discrete time

2. $I[\![\neg\phi]\!]([b, e]) = tt$ iff $I[\![\phi]\!]([b, e]) = ff$

3. $I[\![\phi_1 \vee \phi_2]\!]([b, e]) = tt$ iff $I[\![\phi_1]\!]([b, e]) = tt$ or $I[\![\phi_2]\!]([b, e]) = tt$

4. $I[\![\phi_1 ; \phi_2]\!]([b, e]) = tt$ iff $I[\![\phi_1]\!]([b, m]) = tt$ and $I[\![\phi_2]\!]([m, e]) = tt$, for some
$m \in [b, e] \cap Time$

5. $I[\![\exists p.\, \phi]\!]([b, e]) = tt$ iff for some interpretation $I'$ which is $p$-equivalent to $I$,
$I'[\![\phi]\!]([b, e]) = tt$.

An interpretation $I'$ is $p$-equivalent to another interpretation $I$ iff $I(p_i)(t) =
I'(p_i)(t)$ for all $p_i \neq p$ and $t \in Time$.

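Standard abbreviations from predicate logic, e.g. "$\wedge$" and "$\Rightarrow$", are
used in this paper. Moreover, the following abbreviations for QRDC formulas
are frequently used:

$false \triangleq \lceil\lceil 0 \rceil\rceil$
$true \triangleq \neg false$
$\Diamond\phi \triangleq true ; \phi ; true$
$\Box\phi \triangleq \neg\Diamond\neg\phi$
$\Box_p\phi \triangleq \neg(\neg\phi ; true)$
$\ell = 0 \triangleq \neg\lceil\lceil 1 \rceil\rceil$

For DQRDC, we also use the following abbreviations ($k \in \mathbb{N}$):

$\ell = 1 \triangleq \lceil\lceil 1 \rceil\rceil \wedge \neg(\lceil\lceil 1 \rceil\rceil ; \lceil\lceil 1 \rceil\rceil)$
$\int S = 0 \triangleq \lceil\lceil \neg S \rceil\rceil \vee \ell = 0$
$\int S = 1 \triangleq (\int S = 0) ; (\lceil\lceil S \rceil\rceil \wedge \ell = 1) ; (\int S = 0)$
$\int S = k + 1 \triangleq (\int S = k) ; (\int S = 1) \quad (k \geq 1)$
$\int S \geq k \triangleq (\int S = k) ; true$
$\int S > k \triangleq \int S \geq k + 1$
$\int S \leq k \triangleq \neg(\int S > k)$
$\int S < k \triangleq \int S \leq k - 1$

where $\int S$ denotes the accumulated time when $S$ evaluates to 1. The length $\ell$ of the
current interval can be encoded as $\int 1$.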
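The discrete-time semantics above can be exercised with a small evaluator. The following Python sketch is an illustration only: the tuple encoding of formulas, the function names and the sample traces are assumptions made for this example, and the quantifier $\exists p$ is left out to keep the sketch short. A trace is a list of observations (sets of variables that are true), and the interval $[b, e]$ covers the observations at positions $b, \ldots, e-1$.

```python
def state_holds(s, obs):
    """Evaluate a state expression on one observation (set of true variables)."""
    if s == 0 or s == 1:
        return bool(s)
    if isinstance(s, str):          # a state variable
        return s in obs
    if s[0] == "not":
        return not state_holds(s[1], obs)
    if s[0] == "or":
        return state_holds(s[1], obs) or state_holds(s[2], obs)
    raise ValueError(f"unknown state expression: {s!r}")


def sat(phi, trace, b, e):
    """Decide whether the segment of `trace` over [b, e] satisfies `phi`."""
    op = phi[0]
    if op == "ev":                  # [[S]]: non-point interval, S holds throughout
        return b < e and all(state_holds(phi[1], trace[t]) for t in range(b, e))
    if op == "not":
        return not sat(phi[1], trace, b, e)
    if op == "or":
        return sat(phi[1], trace, b, e) or sat(phi[2], trace, b, e)
    if op == "chop":                # phi1 ; phi2 with some split point m
        return any(sat(phi[1], trace, b, m) and sat(phi[2], trace, m, e)
                   for m in range(b, e + 1))
    raise ValueError(f"unknown formula: {phi!r}")


# Derived operators, following the abbreviations in the text.
FALSE = ("ev", 0)
TRUE = ("not", FALSE)


def eventually(phi):
    return ("chop", TRUE, ("chop", phi, TRUE))


def always(phi):
    return ("not", eventually(("not", phi)))


def s_and(a, b):
    return ("not", ("or", ("not", a), ("not", b)))


# Hypothetical traces: one respects mutual exclusion, the other does not.
mutex = always(("not", ("ev", s_and("ain", "bin"))))
good_trace = [{"ain"}, set(), {"bin"}]
bad_trace = [{"ain"}, {"ain", "bin"}]
```

The evaluator follows the inductive definition directly, so the chop case tries every split point; this is fine for short traces but is not meant as an efficient decision procedure.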

2.3 Segments as Models 

We call each $obs \in Obs = 2^{SVar}$ an observation. For a state expression $S$,
$\mathcal{N}(S) \subseteq Obs$ denotes the set of observations where $S$ is evaluated to 1:

$\mathcal{N}(S) = \{obs \mid (\bigwedge_{p_1 \in obs} p_1 \wedge \bigwedge_{p_2 \in (SVar - obs)} \neg p_2) \Rightarrow S\}$.
For an interpretation $I$ and $t \in Time$, $I(t)$ is identified with the following
observation

$I(t) = \{p \mid p \in SVar \wedge I(p)(t) = 1\}$

In DQRDC, given an interpretation $I$ and an interval $[b, e]$, we use the pair
$(I, [b, e])$, called a segment, to denote the finite sequence $I(b), I(b+1), \ldots, I(e-1)$
of observations. A segment $(I, [b, e])$ satisfies a formula $\phi$, denoted as $I, [b, e] \models \phi$,
iff $I[\![\phi]\!]([b, e]) = tt$. For a formula $\phi$, we call each segment $(I, [b, e])$ satisfying
$I, [b, e] \models \phi$ a model of $\phi$. The set of models of $\phi$, denoted as $\mathcal{M}[\![\phi]\!]$, can be
constructed [5,12] as the following regular language over the alphabet $Obs$:

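$\mathcal{M}[\![\lceil\lceil S \rceil\rceil]\!] = (\mathcal{N}(S))^{+}$ (positive closure)

$\mathcal{M}[\![\phi_1 \vee \phi_2]\!] = \mathcal{M}[\![\phi_1]\!] \cup \mathcal{M}[\![\phi_2]\!]$ (union)

$\mathcal{M}[\![\neg\phi]\!] = Obs^{*} - \mathcal{M}[\![\phi]\!]$ (complementation)

$\mathcal{M}[\![\phi_1 ; \phi_2]\!] = \mathcal{M}[\![\phi_1]\!] \cdot \mathcal{M}[\![\phi_2]\!]$ (concatenation)

$\mathcal{M}[\![\exists p.\, \phi]\!] = Equival_p(\mathcal{M}[\![\phi]\!])$ (equivalence closure)

where $Equival_p(\mathcal{M}[\![\phi]\!])$ is defined as follows.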

Definition 1. Given two finite sequences $\alpha = s_0, s_1, \ldots, s_n$ and $\alpha' = s'_0, s'_1, \ldots,
s'_n$ of observations and a state variable $p$, $\alpha$ and $\alpha'$ are $p$-equivalent iff $s_i \setminus \{p\} =
s'_i \setminus \{p\}$ for $0 \leq i \leq n$.

Given a state variable $p$, the equivalence closure of a set of observation
sequences $\mathcal{M}$, $Equival_p(\mathcal{M})$, is defined to be the set

$\{\alpha \mid \text{there exists } \beta \in \mathcal{M} \text{ such that } \alpha \text{ and } \beta \text{ are } p\text{-equivalent}\}$

We can easily prove the following lemma by induction on the structure of $\phi$.

Lemma 2. In DQRDC, for a formula $\phi$ and a segment $(I, [b, e])$, $(I, [b, e]) \in
\mathcal{M}[\![\phi]\!]$ iff $I, [b, e] \models \phi$.

In DQRDC, a formula $\phi$ is valid iff $\mathcal{M}[\![\phi]\!] = Obs^{*}$, and it is satisfiable iff
$\mathcal{M}[\![\phi]\!] \neq \emptyset$.



3 Finitary Safety Property and Eventual Persistence 
Property 

Our work on classification of properties follows that of [9,3]. However, in our 
setting, a property is a set of finite sequences of states, and therefore some 
modification is necessary. To indicate whether properties are formed by finite or 
infinite sequences of states, we call them finitary properties or infinitary prop- 
erties. 
3.1 Finitary Safety Property

A safety property stipulates that “something (bad) never happens”. In [3], it is 
considered that if a “bad” thing happens in an infinite sequence, then it must 
do so after some finite prefix and must be irremediable. We follow this view, but 
since our property is over finite sequences, we call our property finitary safety 
property and it is formally defined as follows: 

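Definition 3 (finitary safety property). For a finite set $\Sigma$ and a property
$P \subseteq \Sigma^{*}$, $P$ is a finitary safety property, or an FS property, iff:

$\forall \alpha \in \Sigma^{*} : (\alpha \notin P \Rightarrow \exists \beta \in \Sigma^{*} : (\beta \preceq \alpha \wedge \forall \gamma \in \Sigma^{*} : (\beta \preceq \gamma \Rightarrow \gamma \notin P)))$.

By taking negation on both sides, we have

$\forall \alpha \in \Sigma^{*} : (\alpha \in P \Leftarrow \forall \beta \in \Sigma^{*} : (\beta \preceq \alpha \Rightarrow \exists \gamma \in \Sigma^{*} : (\beta \preceq \gamma \wedge \gamma \in P)))$.

This can be simplified, and we have the following theorem which says that a
property is an FS property iff it is prefix-closed.

Theorem 4. A non-empty property $P \subseteq \Sigma^{*}$ is an FS property iff

$\forall \alpha \in \Sigma^{*} : (\alpha \in P \Rightarrow \forall \beta \in \Sigma^{*} : (\beta \preceq \alpha \Rightarrow \beta \in P))$,

i.e. $Pref(P) = P$, where $Pref(P) = \{\beta \mid \beta \in \Sigma^{*} \wedge \exists \alpha \in P : \beta \preceq \alpha\}$.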

3.2 Eventual Persistence Property 

An infinitary property is a liveness property [3] iff all finite executions can be 
extended to satisfy the property. Intuitively, this is to say a “good” thing always 
has the chance to take place no matter what has happened. Although in the
general case the “good thing” may take an infinite number of actions to ensure
(for example, something that happens infinitely often), some “good things” can
occur, and in fact are only meaningful if they occur, after a finite number of steps. This
is often called eventuality. Moreover, we are particularly interested in a class of
properties such that if something (“good”) has happened, nothing later can undo 
it. This is known as persistence. Formally, eventuality and persistence properties 
are defined as: 

Definition 5 (Eventuality and Persistence). For a finite set $\Sigma$ and a property
$P \subseteq \Sigma^{*}$ ($P \neq \emptyset$),

- $P$ is an eventuality property iff $\forall \alpha \in \Sigma^{*} : \exists \beta \in \Sigma^{*} : (\alpha \preceq \beta \wedge \beta \in P)$;

- $P$ is a persistence property iff $\forall \alpha \in P : \forall \beta \in \Sigma^{*} : (\alpha \preceq \beta \Rightarrow \beta \in P)$.

$P$ is called an eventual persistence property, or an EP property, iff $P$ is both an
eventuality property and a persistence property.

The following theorem follows directly from the definition.

Theorem 6. A non-empty property $P \subseteq \Sigma^{*}$ is an EP property iff $Pref(P) =
\Sigma^{*}$ and $P \cdot \Sigma^{*} = P$.
3.3 Recognizing FS and EP Properties

A finitary property can be specified by a finite automaton whose language is
the set of the finite sequences satisfying the property. In the following, similar to
what has been done in [2] w.r.t. Büchi automata expressing safety properties and
liveness properties, we present the rules for recognizing FS and EP properties
from an automata-theoretic point of view.

A finite automaton $FA$ is reduced if, for any state $q$, there is a path from
an initial state to $q$ and there is a path from $q$ to a final state. From a finite
automaton, a reduced $FA$ accepting the same language can always be obtained
by deleting those states either not reachable from any initial state or from which
no final state can be reached. For a reduced $FA$, we define its closure $cl(FA)$
as the automaton obtained by setting all of its states as final states. From the
construction of $cl(FA)$, we can easily get

Lemma 7. For a reduced $FA$, $L(cl(FA)) = Pref(L(FA))$.

We have from Theorem 4 and Lemma 7 the following Theorem for FS properties: 



Theorem 8. A reduced finite automaton $FA$ specifies an FS property iff
$L(FA) = L(cl(FA))$.

For EP properties, we have the following similar theorem, which follows from
Theorem 6 and Lemma 7:

Theorem 9. A finite automaton $FA$ specifies an EP property iff $L(cl(FA)) =
\Sigma^{*}$ and $L(FA) \cdot \Sigma^{*} = L(FA)$.

From Theorems 8 and 9, algorithms for recognizing FS and EP properties 
have been developed [18]. 
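As an illustration of how such checks might look, the following Python sketch classifies a complete deterministic finite automaton. It is not the algorithm of [18]. It relies on two observations that hold for deterministic automata: the language is prefix-closed exactly when every state that is both reachable and able to reach a final state is itself final, and $L \cdot \Sigma^{*} = L$ exactly when every state reachable from a reachable final state is final. The encoding of the automaton as a dictionary and the two sample automata are assumptions made for this example.

```python
from collections import deque


def _closure(starts, edges):
    """States reachable from `starts` via `edges` (state -> set of states)."""
    seen = set(starts)
    queue = deque(seen)
    while queue:
        q = queue.popleft()
        for r in edges.get(q, ()):
            if r not in seen:
                seen.add(r)
                queue.append(r)
    return seen


def classify(delta, init, finals):
    """Return (is_fs, is_ep) for a complete DFA.

    `delta` maps (state, symbol) to the successor state and is assumed total.
    """
    finals = set(finals)
    fwd, bwd = {}, {}
    for (q, _symbol), r in delta.items():
        fwd.setdefault(q, set()).add(r)
        bwd.setdefault(r, set()).add(q)
    reachable = _closure({init}, fwd)
    live = _closure(finals, bwd)      # states from which a final state is reachable
    trimmed = reachable & live        # states of the reduced automaton
    nonempty = init in live           # the language is non-empty
    is_fs = nonempty and trimmed <= finals
    is_ep = (nonempty
             and reachable <= live                              # Pref(L) = Sigma*
             and _closure(finals & reachable, fwd) <= finals)   # L . Sigma* = L
    return is_fs, is_ep


# "bad never happens": state 0 is final, state 1 is a rejecting trap.
safety = {(0, "ok"): 0, (0, "bad"): 1, (1, "ok"): 1, (1, "bad"): 1}
# "good has happened": state 1 is final and absorbing.
eventual = {(0, "good"): 1, (0, "other"): 0, (1, "good"): 1, (1, "other"): 1}
```

For non-deterministic automata these state-based shortcuts are not sufficient in general, and the language equalities of Theorems 8 and 9 would have to be checked directly.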

3.4 FS Formulas and EP Formulas in DQRDC 

In DQRDC, each formula $\phi$ defines a finitary property $\mathcal{M}[\![\phi]\!]$. $\phi$ is called an
FS/EP formula iff $\mathcal{M}[\![\phi]\!]$ is an FS/EP property. For any DQRDC formula, we
can always semantically identify whether it is an FS/EP formula or not by
examining the automaton accepting $\mathcal{M}[\![\phi]\!]$. The following two theorems provide
syntactical conditions which are sufficient (but not necessary) to judge these two
kinds of formulas. Due to limitations of space, the detailed proofs are omitted,
and can be found in [18].

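Theorem 10 (FS formula). Every DQRDC formula of the form $\lceil\lceil S \rceil\rceil^{*}$, $\int S \leq
k$ ($k \in \mathbb{N}$), $\Box_p\phi$ or $\Box\phi$ is an FS formula, and if $\phi_1$, $\phi_2$ are FS formulas, then so
are $\phi_1 \wedge \phi_2$, $\phi_1 \vee \phi_2$, $\phi_1 ; \phi_2$ and $\exists p.\, \phi_1$.

Theorem 11 (EP formula). Every DQRDC formula of the form $\int S \geq k$ ($S \neq
0$ and $k \in \mathbb{N}$) and $\Diamond\phi$ is an EP formula, and if $\phi_1$, $\phi_2$ are EP formulas, then so
are $\phi_1 \wedge \phi_2$, $\phi_1 \vee \phi_2$, $\phi_1 ; \phi_2$ and $\exists p.\, \phi_1$.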
4 Satisfaction Relations

A finite sequence $\alpha$ satisfies a finitary property $P$, denoted by $\alpha \models P$, iff $\alpha \in P$.
For an $\omega$-sequence, we first look at the satisfaction relations for FS and EP
properties separately.

For an FS property, an $\omega$-sequence satisfies the property iff all the finite
prefixes of the sequence satisfy it.

Definition 12. For an $\omega$-sequence $\sigma$ and an FS property $P$, both defined over $\Sigma$,
$\sigma$ satisfies $P$, denoted as $\sigma \models_i P$, iff $\forall \alpha \in \Sigma^{*} : (\alpha \preceq \sigma \Rightarrow \alpha \in P)$.

Intuitively, an $\omega$-sequence satisfies an EP property iff it has a finite prefix
where the “good” thing has happened.

Definition 13. For an $\omega$-sequence $\sigma$ and an EP property $P$, both defined over $\Sigma$,
$\sigma$ satisfies $P$, denoted as $\sigma \models_i P$, iff $\exists \alpha \in \Sigma^{*} : (\alpha \preceq \sigma \wedge \alpha \in P)$.

The two satisfaction relations can be unified by a more general one.

Definition 14 (satisfaction of finitary property). For an $\omega$-sequence $\sigma$ and
a finitary property $P$, both defined over $\Sigma$, $\sigma$ satisfies $P$, denoted as $\sigma \models_i P$, iff
$\exists \gamma \in \Sigma^{*} : (\gamma \preceq \sigma \wedge \forall \alpha \in \Sigma^{*} : (\gamma \preceq \alpha \preceq \sigma \Rightarrow \alpha \models P))$.

Intuitively, this says that an $\omega$-sequence satisfies a finitary property if all except
a finite number of the finite prefixes satisfy the property.

Lemma 15. Definition 14 is equivalent to Definition 12 for FS properties and
Definition 13 for EP properties.

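Proof. For an FS property $P$, Theorem 4 says that $P$ is prefix-closed, and it
follows that for any $\omega$-sequence $\sigma$ over $\Sigma$, $\forall \alpha \in \Sigma^{*} : (\alpha \preceq \sigma \Rightarrow \alpha \in P)$ iff

$\exists \gamma \in \Sigma^{*} : (\gamma \preceq \sigma \wedge \forall \alpha \in \Sigma^{*} : (\gamma \preceq \alpha \preceq \sigma \Rightarrow \alpha \in P))$.

Namely, Definitions 12 and 14 are equivalent for FS properties.

For an EP property $P$, Definition 5 indicates that if a prefix satisfies $P$,
then all its extensions satisfy $P$ too. Therefore, for any $\omega$-sequence $\sigma$ over $\Sigma$,
$\exists \alpha \in \Sigma^{*} : (\alpha \preceq \sigma \wedge \alpha \in P)$ iff

$\exists \gamma \in \Sigma^{*} : (\gamma \preceq \sigma \wedge \forall \alpha \in \Sigma^{*} : (\gamma \preceq \alpha \preceq \sigma \Rightarrow \alpha \in P))$.

Consequently, Definitions 13 and 14 are equivalent for EP properties. □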



5 Model Checking 



Model checking [4] is a verification technique that involves constructing a model 
of a system and checking if a desired property holds in that model. In this section, 
we study model checking finitary properties specified in DQRDC. 
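For a system $S$, $\mathcal{H}[\![S]\!]$ denotes the set of all behaviors, i.e., $\omega$-sequences of
system states, of $S$, and $S$ satisfies a DQRDC formula $\phi$ expressing a finitary
property iff $\forall \sigma \in \mathcal{H}[\![S]\!] : \sigma \models_i \phi$, i.e.

$\forall \sigma \in \mathcal{H}[\![S]\!] : (\exists \gamma \in Obs^{*} : (\gamma \preceq \sigma \wedge \forall \alpha \in Obs^{*} : (\gamma \preceq \alpha \preceq \sigma \Rightarrow \alpha \models \phi)))$.

Consequently, $S$ does not satisfy $\phi$ iff

$\exists \sigma \in \mathcal{H}[\![S]\!] : (\forall \gamma \in Obs^{*} : (\gamma \preceq \sigma \Rightarrow \exists \alpha \in Obs^{*} : (\gamma \preceq \alpha \preceq \sigma \wedge \alpha \not\models \phi)))$.

We can construct a finite automaton $FA(\neg\phi)$, which accepts precisely all the
finite sequences not satisfying $\phi$, thus the above condition can be rewritten as

$\exists \sigma \in \mathcal{H}[\![S]\!] : (\forall \gamma \in Obs^{*} : (\gamma \preceq \sigma \Rightarrow \exists \alpha \in Obs^{*} : (\gamma \preceq \alpha \preceq \sigma \wedge \alpha \in L(FA(\neg\phi)))))$.

When $FA(\neg\phi)$ is deterministic, referred to as $DFA(\neg\phi)$, we shall prove this
condition is equivalent to

$\exists \sigma \in \mathcal{H}[\![S]\!] : \sigma \in L(DBA(\neg\phi))$,

where $DBA(\neg\phi)$ is the deterministic Büchi automaton [1] obtained by taking
final states of $DFA(\neg\phi)$ as accepting states.

Therefore, to determine whether $S$ satisfies $\phi$, it suffices to check whether
$\mathcal{H}[\![S]\!] \cap L(DBA(\neg\phi)) = \emptyset$. If the intersection is empty, $S$ satisfies $\phi$, and if not,
$S$ does not satisfy $\phi$.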

Lemma 16. For an $\omega$-sequence $\sigma$ and a deterministic finite automaton $DFA$,
let $DBA$ be the deterministic Büchi automaton obtained by taking all the final
states of $DFA$ as accepting states, then

$\forall \alpha \in Obs^{*} : (\alpha \preceq \sigma \Rightarrow \exists \gamma \in Obs^{*} : (\alpha \preceq \gamma \preceq \sigma \wedge \gamma \in L(DFA)))$ iff $\sigma \in L(DBA)$.

Proof. The if case is obvious and we only prove the only if case. From the
assumption, there exists an infinite series of finite sequences $\alpha_1, \alpha_2, \ldots$, with
$\alpha_1 \preceq \alpha_2 \preceq \cdots \preceq \sigma$, $\alpha_i \neq \alpha_{i+1}$ and $\alpha_i \in L(DFA)$ for all $i \geq 1$. Obviously, $\alpha_1$ ends
in a final state. Because the automaton is deterministic, $\alpha_2$ will first go through
the same states as $\alpha_1$, in particular, the first final state, and then end in a
(possibly different) final state. Therefore, $\alpha_2$ goes through the final states, which
are the accepting states of the Büchi automaton, at least twice. By induction, we
know that $\alpha_i$ goes through the accepting states of the Büchi automaton at least
$i$ times. Since there are infinitely many such $\alpha_i$'s, some accepting state is visited
infinitely many times by $\sigma$, and therefore $\sigma \in L(DBA)$. □
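Lemma 16 can be tried out on ultimately periodic behaviors, i.e. $\omega$-sequences of the form $u \cdot v^{\omega}$. The Python sketch below is an illustration under stated assumptions and is not how Spin works internally: the automaton is a complete deterministic finite automaton given as a dictionary, and the function name, the sample automaton and the sample words are hypothetical. The run is followed until the state at the start of a loop iteration repeats; the word is accepted by the Büchi automaton iff a final state occurs in the repeating part.

```python
def buchi_accepts_lasso(delta, init, finals, prefix, loop):
    """Does the deterministic Buchi automaton accept prefix . loop^omega?

    `delta` maps (state, symbol) to the next state and is assumed total;
    `loop` must be non-empty. Final states of the DFA act as accepting states.
    """
    if not loop:
        raise ValueError("loop must be non-empty")
    q = init
    for a in prefix:
        q = delta[(q, a)]
    seen_at = {}   # state at the start of a loop iteration -> iteration index
    visited = []   # visited[i] = states entered during loop iteration i
    while q not in seen_at:
        seen_at[q] = len(visited)
        states = set()
        for a in loop:
            q = delta[(q, a)]
            states.add(q)
        visited.append(states)
    # Iterations from seen_at[q] onwards repeat forever.
    recurring = set().union(*visited[seen_at[q]:])
    return bool(recurring & set(finals))


# DFA for the negation of "bad never happens": state 1 (bad seen) is final.
violation = {(0, "ok"): 0, (0, "bad"): 1, (1, "ok"): 1, (1, "bad"): 1}
```

A behavior accepted by this automaton violates the property, so a system satisfies the property only if none of its behaviors is accepted.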

Note that in the above proof, the assumption that the automaton is deterministic
is important. However, this is not necessary for checking FS properties. If $\phi$ is
an FS property, in $FA(\neg\phi)$, any input symbol will lead a final state to (another)
final state. Informally, this is because for an FS property, if something “bad” has
happened in a behavior, any extension of it is still a bad behavior. Therefore,
if a prefix of $\sigma$ has reached a final state, any further input will lead to a final
state, and subsequently $\sigma$ is accepted by the corresponding Büchi automaton.
Since determinization can be expensive, it is in general more efficient to use
non-deterministic automata for model checking FS properties.

Spin [7] is a widely distributed software system for on-the-fly model checking. 
In Spin, the system is modelled using Promela, its input language, as a set 
of interacting processes. The negation of the desired property is encoded in a 
temporal claim, which also takes the form of a Promela program. Each process is 
translated into an automaton, and the global behavior of the system is obtained 
by computing the asynchronous product of these automata. The temporal claim 
is translated to a Büchi automaton. The synchronous product of this automaton
and the automaton representing the global behavior of the system is computed. If
the language accepted by the resulting Büchi automaton is empty, no behavior of
the system satisfies the negation of the desired property, i.e. the system satisfies 
the property. Otherwise, the language contains some system behaviors violating 
the desired property. 

We have implemented the algorithm which takes a DQRDC formula as input 
and generates the temporal claim in Promela. This allows us to model check 
properties studied in this paper when they are expressed as DQRDC formulas 
using Spin. Our implementation makes use of the C++ package for finite state 
machine, Grail+ [8]. The compiler construction program Parser Generator from 
Bumble-Bee Software Ltd. is also used. 



6 Case Study 

In this section, we study model checking of the well-known Peterson’s mutual
exclusion algorithm. In Fig. 1, the algorithm for the case of two processes is
modelled in Promela.



 1  #define true 1
 2  #define false 0
 3  #define aturn 1
 4  #define bturn 0
 5
 6  bool areq, breq, turn;
 7  bool ain;
 8  bool bin;
 9
10  proctype a()
11  {
12    do
13    :: atomic{ areq=true;
14         turn=bturn; }
15       (breq==false || turn==aturn);
16       ain=true;
17       /* critical section cs */
18       /* assert(bin==false); */
19       ain=false;
20       areq=false;
21    od
22  }
23
24  proctype b()
25  {
26    do
27    :: atomic{ breq=true;
28         turn=aturn; }
29       (areq==false || turn==bturn);
30       bin=true;
31       /* critical section */
32       /* assert(ain==false); */
33       bin=false;
34       breq=false;
35    od
36  }
37
38  init
39  {
40    run a(); run b();
41  }


Fig. 1. Peterson’s mutual exclusion algorithm 
The algorithm uses three global variables turn, areq and breq. The statements
on lines 15 and 29, which can only be executed if the boolean expressions
in them evaluate to true, synchronize the processes. If only one process
applies to enter the critical section, the entry is immediate. Variable
turn is used to decide which process enters the critical section when both
processes have requested entry. Variable ain (bin) is used here to facilitate property
specification and it is set to true iff process a (b) is in the critical section.

The essential property of a mutual exclusion algorithm is naturally mutual
exclusion, which can be easily specified in DQRDC as

$Req_1 = \Box(\ell \geq 1 \Rightarrow \lceil\lceil \neg(ain \wedge bin) \rceil\rceil)$

The corresponding generated Büchi automaton in Promela is shown in Fig. 2.



never {
T0_init:
    if
    :: goto T1
    fi;
T1:
    if
    :: ((true)) -> goto T1
    :: ((!ain || !bin)) -> goto T2
    :: ((ain && bin)) -> goto accept_5
    fi;
T2:
    if
    :: ((!ain || !bin)) -> goto T2
    :: ((ain && bin)) -> goto accept_5
    fi;
accept_5:
    if
    :: ((true)) -> goto accept_5
    fi;
}



Fig. 2. Temporal claim for $Req_1$.



The mutual exclusion property is simple and can be expressed using assertions
(as shown on lines 18 and 32) or a simple monitor process. Next we consider
a more involved property,

if one process has requested to enter its critical section when the 
other process is in the critical section, then the first process must 
have entered the critical section before the second one enters the 
critical section for another time. 

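This property can be expressed as follows w.l.o.g.:

$Req_2 = \Box(\lceil\lceil areq \wedge bin \rceil\rceil ; \lceil\lceil \neg bin \rceil\rceil ; \lceil\lceil bin \rceil\rceil \Rightarrow \Diamond\lceil\lceil ain \rceil\rceil)$

The generated Büchi automaton in Promela is shown in Fig. 3.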

Taking the temporal claim and the system model as input, the Spin model 
checker shows that the properties are satisfied. 



7 Discussion 

In this paper, we have studied the automatic verification of reactive systems 
against interval logic specifications. Although various interval logics have been 
developed for some time, there is little work with a similar goal, which is somewhat
unusual in view of the extensive research regarding LTL. In fact, there have been no
clear definitions of what it means for a reactive system to satisfy properties
in interval logics. We have considered two classes of common properties and
for them defined satisfaction relations that have natural meanings. The two
satisfaction relations are unified by a more general one for which model checking
support has been provided. It is not clear to us whether the general relation is meaningful
for properties outside the two classes and their simple combinations.

Our work is useful to people who prefer to use interval logics. In fact, both 
LTL and interval logics can be used together in specifying properties of a system. 
For example, in Peterson’s mutual exclusion algorithm case study, one can use 
the interval logic to specify $Req_2$ as we have done, since this property may not
be easy to express in LTL, but use LTL to specify other properties for which 
LTL is better or just as good. 

Although the algorithm converting the interval logic formulas to finite au- 
tomata is simple and well-known, there have been few implementations. The 
early BDD-based implementation by Skakkebaek and Sestoft is integrated into 
a proof assistant [15] developed on an old version of PVS which has not been 
supported for years. The more recent implementation by Pandya [12] uses the 
Mona [6] system by mapping the interval logic to the language of Mona, a 
monadic second-order logic. Although the implementation of Mona is highly 
efficient, its sophistication also makes it difficult to handle. What we really need
is a package on finite automata, and with Grail+, we have better control over
the implementation. For example, we can control whether to determinize the
automaton, which is an expensive operation.

Pandya has also studied model checking reactive systems with interval logics. 
In his approach, the interval logic does not stand alone and is combined into other 
logics, for example, into CTL [13]. Pandya has apparently considered combining 
the interval logic with LTL, and using Spin to check properties expressed in the 
new logic. The resulting logics are substantially different from existing ones and 
considerable work may be needed before they are accepted. 



never {
T0_init:
    if
    :: goto T0
    fi;
T0:
    if
    :: ((true)) -> goto T0
    :: ((areq && bin && !ain)) -> goto T2
    fi;
T2:
    if
    :: ((!bin && !ain)) -> goto T6
    :: ((areq && bin && !ain)) -> goto T2
    fi;
T6:
    if
    :: ((!bin && !ain)) -> goto T6
    :: ((bin && !ain)) -> goto accept_14
    fi;
accept_14:
    if
    :: ((true)) -> goto accept_19
    :: ((bin && !ain)) -> goto accept_14
    fi;
accept_19:
    if
    :: ((true)) -> goto accept_19
    fi;
}



Fig. 3. Temporal claim for Req 2

Abstract. The finite powerset construction upgrades an abstract domain
by allowing for the representation of finite disjunctions of its el-
ements. In this paper we define two generic widening operators for the 
finite powerset abstract domain. Both widenings are obtained by lifting 
any widening operator defined on the base-level abstract domain and are 
parametric with respect to the specification of a few additional operators. 
We illustrate the proposed techniques by instantiating our widenings on 
powersets of convex polyhedra, a domain for which no non-trivial widen- 
ing operator was previously known. 



1 Introduction 

The design and implementation of effective, expressive and efficient abstract 
domains for data-flow analysis and model-checking is a very difficult task. For 
this reason, starting with [11], there continues to be strong interest in techniques 
that derive enhanced abstract domains by applying systematic constructions 
on simpler, existing domains. Disjunctive completion, direct product, reduced 
product and reduced power are the first and most famous constructions of this 
kind [11]; several variations of them as well as other constructions have been
proposed in the literature.

Once the carrier of the enhanced abstract domain has been obtained by 
one of these systematic constructions, the abstract operations can be defined, as 
usual, as the optimal approximations of the concrete ones. While this completely 
solves the specification problem, it usually leaves the implementation problem 
with the designer and gives no guarantees about the efficiency (or even the 
computability) of the resulting operations. This motivates the importance of 
generic techniques whereby correct, even though not necessarily optimal, domain 
operations are derived automatically or semi-automatically from those of the 
domains the construction operates upon [8,11,18]. 

This paper focuses on the derivation of widening operators for a kind of
disjunctive refinement we call finite powerset construction. As far as we know, this
is the first time that the problem of deriving non-trivial, provably correct widening
operators in a domain refinement is tackled successfully. We also present its
specialization to finite powersets of convex polyhedra. Not only is this included
to help the reader gain a better intuition regarding the underlying approach but
also to provide a definitely non-toy instance that is practically useful for
applications such as data-flow analysis and model checking. Sets of polyhedra are
implemented in Polylib [24,28] and its successor PolyLib [25], even though no
widenings are provided. Sets of polyhedra, represented with Presburger formulas
made available by the Omega library [23,26], are used in the verifier described
in [7]; there, an extrapolation operator (i.e., a widening without convergence
guarantee) on sets of polyhedra is described. Another extrapolation operator is
implemented in the automated verification tool described in [16], where sets of
polyhedra are represented using the clp(Q, R) constraint library [22].

* This work has been partly supported by MURST projects “Aggregate- and Number-
Reasoning for Computing: from Decision Algorithms to Constraint Programming
with Multisets, Sets, and Maps” and “Constraint Based Verification of Reactive
Systems.”

The rest of the paper is structured as follows: Section 2 recalls the basic 
concepts and notations; Section 3 defines the finite powerset construction as a 
disjunctive refinement for any abstract domain that is a join-semilattice; Sec- 
tion 4 presents two distinct strategies for upgrading any widening for the base- 
level domain into a widening for the finite powerset domain; Section 5 provides 
a technique for controlling the precision/efficiency trade-off of these widenings; 
Section 6 concludes. The proofs of all the stated results can be found in [5]. 

2 Preliminaries 

For a set $S$, $\wp(S)$ is the powerset of $S$, whereas $\wp_{\mathrm{f}}(S)$ is the set of
all the finite subsets of $S$; the cardinality of $S$ is denoted by $\# S$. The first
limit ordinal is denoted by $\omega$. Let $O$ be a set equipped with a well-founded
ordering ‘$\succ$’. If $M$ and $N$ are finite multisets over $O$, $\#(n, M)$ denotes the
number of occurrences of $n \in O$ in $M$ and $M \gg N$ means that there exists $j \in O$
such that $\#(j, M) > \#(j, N)$ and, for each $k \in O$ with $k \succ j$, we have
$\#(k, M) = \#(k, N)$. The relation ‘$\gg$’ is well-founded [17].
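
The multiset ordering recalled above can be illustrated by a small Python sketch. It is
only an illustration under a simplifying assumption: the elements of $O$ are Python values
that are totally ordered by the built-in `>` (integers or tuples), so that the witness $j$
of the definition is necessarily the greatest element whose multiplicities differ. The
name `multiset_greater` is ours and does not come from [17].

```python
from collections import Counter

def multiset_greater(m, n):
    """Return True iff M >> N, for finite multisets given as iterables.

    Assumption: the elements are totally ordered by Python's built-in `>`.
    """
    count_m, count_n = Counter(m), Counter(n)
    # Scan the elements from the greatest downwards: the first element whose
    # multiplicities differ is the only possible witness j of the definition.
    for j in sorted(set(count_m) | set(count_n), reverse=True):
        if count_m[j] != count_n[j]:
            return count_m[j] > count_n[j]
    return False  # equal multisets are not related


print(multiset_greater([3, 1, 1], [2, 2, 2, 2]))  # one 3 outweighs any number of 2s
print(multiset_greater([1, 2], [2, 2]))
```

In the first call the witness is $j = 3$; in the second call the greatest element with
different multiplicities is 2, which occurs more often in the second multiset, so the
relation does not hold.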

In this paper we will adopt the abstract interpretation framework proposed 
in [13, Section 7], where the correspondence between the concrete and the ab- 
stract domains is induced from a concrete approximation relation and a con- 
cretization function. Since we are not aiming at maximum generality, for the 
sole purpose of simplifying the presentation, we will consider a particular in- 
stance of the framework by assuming a few additional but non-essential domain 
properties. 

The concrete domain is modeled as a complete lattice of semantic properties
$\langle C, \sqsubseteq, \bot, \top, \sqcup, \sqcap \rangle$; as usual, the concrete
approximation relation $c_1 \sqsubseteq c_2$ holds if $c_1$ is a stronger property than
$c_2$ (i.e., $c_2$ approximates $c_1$). The concrete semantics $c \in C$ of a program is
formalized as the least fixpoint of a continuous (concrete) semantic function
$\mathcal{F}\colon C \to C$, which is iteratively computed starting from the bottom
element, so that $c = \mathcal{F}^\omega(\bot)$.

The abstract domain $\hat{D} = \langle D, \vdash, \mathbf{0}, \oplus \rangle$ is modeled
as a join-semilattice (i.e., the least upper bound $d_1 \oplus d_2$ exists for all
$d_1, d_2 \in D$). We will overload ‘$\oplus$’ so that, for each
$S \in \wp_{\mathrm{f}}(D)$, $\bigoplus S$ denotes the least upper bound of $S$. The
abstract domain $\hat{D}$ is related to the concrete domain by a monotonic and injective
concretization function $\gamma\colon D \to C$. Monotonicity and injectivity mean that the
abstract partial order ‘$\vdash$’ is indeed the approximation relation induced on $D$
by the concretization function $\gamma$. For all $d_1, d_2 \in D$, we will use the notation
$d_1 \Vdash d_2$ to mean that $d_1 \vdash d_2$ and $d_1 \neq d_2$. We assume the existence
of a monotonic abstract semantic function $\mathcal{F}^\sharp\colon D \to D$ that is sound
with respect to $\mathcal{F}\colon C \to C$:

$$\forall c \in C : \forall d \in D : c \sqsubseteq \gamma(d) \implies \mathcal{F}(c) \sqsubseteq \gamma\bigl(\mathcal{F}^\sharp(d)\bigr). \tag{1}$$

This local correctness condition ensures that each concrete iterate can be safely 
approximated by computing the corresponding abstract iterate (starting from
the bottom element $\mathbf{0} \in D$). However, due to the weaker algebraic properties
satisfied by the abstract domain, the abstract upward iteration sequence may 
not converge. Even when it converges, it may fail to do so in a finite number of 
steps, therefore being useless for the purposes of static analysis. 

Widening operators [9,10,13,14] provide a simple and general characterization 
for enforcing and accelerating convergence. We will adopt a minor variation of 
the classical definition of widening operator (see footnote 6 in [14, p. 275]). 

Definition 1. (Widening.) Let $\langle D, \vdash, \mathbf{0}, \oplus \rangle$ be a
join-semilattice. The partial operator $\nabla\colon D \times D \rightarrowtail D$ is a
widening if

1. $d_1 \vdash d_2$ implies $d_2 \vdash d_1 \nabla d_2$ for each $d_1, d_2 \in D$;

2. for each increasing chain $d_0 \vdash d_1 \vdash \cdots$, the increasing chain defined by
$d'_0 := d_0$ and $d'_{i+1} := d'_i \nabla (d'_i \oplus d_{i+1})$, for $i \in \mathbb{N}$,
is not strictly increasing.

Any widening operator ‘$\nabla$’ induces a corresponding partial ordering
‘$\vdash_\nabla$’ on the domain $D$; this is defined as the reflexive and transitive
closure of the relation
$\{\, (d_1, d) \in D \times D \mid \exists d_2 \in D \mathrel{.} d_1 \Vdash d_2 \land d = d_1 \nabla d_2 \,\}$.
The relation ‘$\vdash_\nabla$’ satisfies the ascending chain condition. We write
$d_1 \Vdash_\nabla d_2$ to denote $d_1 \vdash_\nabla d_2 \land d_1 \neq d_2$.

It can be proved that the upward iteration sequence with widenings starting
at the bottom element $d_0 := \mathbf{0}$ and defined by

$$d_{i+1} := \begin{cases} d_i, & \text{if } \mathcal{F}^\sharp(d_i) \vdash d_i; \\ d_i \nabla \bigl(d_i \oplus \mathcal{F}^\sharp(d_i)\bigr), & \text{otherwise,} \end{cases}$$

converges after a finite number $j \in \mathbb{N}$ of iterations [14]. Note that the
widening is always applied to arguments $d = d_i$ and
$d' = d_i \oplus \mathcal{F}^\sharp(d_i)$ satisfying $d \Vdash d'$. Also, when condition (1)
holds, the post-fixpoint $d_j \in D$ of $\mathcal{F}^\sharp$ is a correct approximation of
the concrete semantics, i.e., $\mathcal{F}^\omega(\bot) \sqsubseteq \gamma(d_j)$.
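
The upward iteration sequence with widenings can be sketched in a few lines of Python.
The sketch below is not taken from [14]: it assumes, purely for illustration, that $D$ is
the domain of closed intervals (with `None` playing the role of the bottom element), that
the widening is the usual interval widening sending unstable bounds to infinity, and that
the abstract semantic function models the loop `x = 0; while ...: x = x + 1`. All function
names are hypothetical.

```python
import math

INF = math.inf
BOTTOM = None  # the empty interval plays the role of the bottom element

def leq(a, b):
    """Approximation ordering: interval a is contained in interval b."""
    if a is BOTTOM:
        return True
    if b is BOTTOM:
        return False
    return b[0] <= a[0] and a[1] <= b[1]

def join(a, b):
    """Least upper bound of two intervals."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    """Interval widening, assuming leq(a, b): unstable bounds become infinite."""
    if a is BOTTOM:
        return b
    lo = a[0] if b[0] >= a[0] else -INF
    hi = a[1] if b[1] <= a[1] else INF
    return (lo, hi)

def iterate_with_widening(f_sharp):
    """Compute the iteration sequence d_0, d_1, ... until a post-fixpoint is reached."""
    d, steps = BOTTOM, 0
    while not leq(f_sharp(d), d):
        d = widen(d, join(d, f_sharp(d)))
        steps += 1
    return d, steps

def f_sharp(d):
    """Assumed abstract semantics of:  x = 0; while ...: x = x + 1."""
    init = (0, 0)
    if d is BOTTOM:
        return init
    return join(init, (d[0] + 1, d[1] + 1))

print(iterate_with_widening(f_sharp))
```

With these assumptions the sequence is $\mathbf{0}$, $[0,0]$, $[0,+\infty)$ and the last
element is a post-fixpoint, so the loop stops after two widening steps, whereas the
iteration without widening would never stabilize.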



2.1 The Abstract Domain of Polyhedra 

We now instantiate the abstract interpretation framework sketched above by
presenting the well-known abstract domain of closed convex polyhedra. This
domain will be used throughout the paper to illustrate the generic widening
techniques that will be defined.

Let $\mathbb{R}^n$, where $n > 0$, be the $n$-dimensional real vector space. The set
$\mathcal{P} \subseteq \mathbb{R}^n$ is a closed and convex polyhedron (polyhedron, for
short) if and only if $\mathcal{P}$ can be expressed as the intersection of a finite number
of closed affine half-spaces of $\mathbb{R}^n$. The set $\mathbb{CP}_n$ of closed convex
polyhedra on $\mathbb{R}^n$, when partially ordered by subset inclusion, is a lattice having
the empty set and $\mathbb{R}^n$ as the bottom and top elements, respectively; the binary
meet operation is set-intersection, whereas the binary join operation, denoted by
‘$\uplus$’, is called convex polyhedral hull (poly-hull, for short). Therefore, we have the
abstract domain

$$\widehat{\mathbb{CP}}_n := \langle \mathbb{CP}_n, \subseteq, \emptyset, \mathbb{R}^n, \uplus, \cap \rangle.$$

This domain can be related to several concrete domains, depending on the
intended application. One example of a concrete domain is the complete lattice

$$\hat{A}_n := \langle \wp(\mathbb{R}^n), \subseteq, \emptyset, \mathbb{R}^n, \cup, \cap \rangle.$$

Note that $\widehat{\mathbb{CP}}_n$ is a meet-sublattice of $\hat{A}_n$, sharing the same
bottom and top elements. Another example is the complete lattice
$\hat{B}_n := \langle \wp_c(\mathbb{R}^n), \subseteq, \emptyset, \mathbb{R}^n, \uplus_c, \cap \rangle$,
where $\wp_c(\mathbb{R}^n)$ is the set of all topologically closed and convex subsets of
$\mathbb{R}^n$ and the join operation ‘$\uplus_c$’ returns the smallest topologically
closed and convex set containing its arguments. As a final example of concrete domain for
some analysis, consider the complete lattice
$\hat{C}_n := \langle \wp(\mathbb{CP}_n), \subseteq, \emptyset, \mathbb{CP}_n, \cup, \cap \rangle$.

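The abstract domain $\widehat{\mathbb{CP}}_n$, which is a join-semilattice, is related to
the concrete domains shown above by the concretization functions
$\gamma^A\colon \mathbb{CP}_n \to \wp(\mathbb{R}^n)$,
$\gamma^B\colon \mathbb{CP}_n \to \wp_c(\mathbb{R}^n)$ and
$\gamma^C\colon \mathbb{CP}_n \to \wp(\mathbb{CP}_n)$: for each
$\mathcal{P} \in \mathbb{CP}_n$, we have both $\gamma^A(\mathcal{P}) := \mathcal{P}$ and
$\gamma^B(\mathcal{P}) := \mathcal{P}$, and
$\gamma^C(\mathcal{P}) := \mathop{\downarrow}\mathcal{P} := \{\, \mathcal{Q} \in \mathbb{CP}_n \mid \mathcal{Q} \subseteq \mathcal{P} \,\}$.
All these concretization functions are trivially monotonic and injective.

For each choice of the concrete domain
$C \in \{ \wp(\mathbb{R}^n), \wp_c(\mathbb{R}^n), \wp(\mathbb{CP}_n) \}$, the
continuous semantic function $\mathcal{F}\colon C \to C$ and the corresponding monotonic
abstract semantic function $\mathcal{F}^\sharp\colon \mathbb{CP}_n \to \mathbb{CP}_n$,
which is assumed to be correct, are deliberately left unspecified. The domain
$\widehat{\mathbb{CP}}_n$ contains infinite ascending chains having no least upper bound in
$\mathbb{CP}_n$. Thus, the convergence of the abstract iteration sequence has to be
guaranteed by the adoption of widening operators.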

The first widening on polyhedra was introduced in [15] and refined in [19].
This operator, denoted by ‘$\nabla_s$’, has been termed standard widening and used
almost universally. In [4], we presented a framework designed so that all its
instances are widening operators on $\mathbb{CP}_n$. The standard widening
‘$\nabla_s$’ is an instance of the framework and all the other instances, including the
specific widening ‘$\nabla$’ defined and experimentally evaluated in [4], are at least as
precise as ‘$\nabla_s$’. For a formal definition of both ‘$\nabla_s$’ and
‘$\nabla$’, we refer the reader to [4].

3 A Disjunctive Refinement 

In this section, we present the finite powerset operator, which is a domain
refinement similar to disjunctive completion [11] and is obtained by a variant of




Widening Operators for Powerset Domains 139 



the down-set completion construction presented in [12]. The following notation 
and definitions are mainly borrowed from [2, Section 6]. 

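Definition 2. (Non-redundancy.) Let $\hat{D} = \langle D, \vdash, \mathbf{0}, \oplus \rangle$
be a join-semilattice. The set $S \in \wp(D)$ is called non-redundant with respect to
‘$\vdash$’ if and only if $\mathbf{0} \notin S$ and
$\forall d_1, d_2 \in S : d_1 \vdash d_2 \implies d_1 = d_2$. The set of finite
non-redundant subsets of $D$ (with respect to ‘$\vdash$’) is denoted by
$\wp_{\mathrm{fn}}(D, \vdash)$. The reduction function
$\Omega_D^{\vdash}\colon \wp_{\mathrm{f}}(D) \to \wp_{\mathrm{fn}}(D, \vdash)$ mapping each
finite set into its non-redundant counterpart is defined, for each
$S \in \wp_{\mathrm{f}}(D)$, by

$$\Omega_D^{\vdash}(S) := S \setminus \{\, d \in S \mid d = \mathbf{0} \lor \exists d' \in S \mathrel{.} d \Vdash d' \,\}.$$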

The restriction to the finite subsets reflects the fact that here we are mainly 
interested in an abstract domain where disjunctions are implemented by explicit 
collections of elements of the base-level abstract domain. 

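Definition 3. (Finite powerset domain.) Let
$\hat{D} := \langle D, \vdash, \mathbf{0}, \oplus \rangle$ be a join-semilattice. The finite
powerset domain over $\hat{D}$ is the join-semilattice

$$\hat{D}_P := \langle \wp_{\mathrm{fn}}(D, \vdash), \vdash_P, \mathbf{0}_P, \oplus_P \rangle,$$

where $\mathbf{0}_P := \emptyset$ and, for all $S_1, S_2 \in \wp_{\mathrm{fn}}(D, \vdash)$,
$S_1 \oplus_P S_2 := \Omega_D^{\vdash}(S_1 \cup S_2)$.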

The approximation ordering ‘$\vdash_P$’ induced by ‘$\oplus_P$’ is the Hoare
powerdomain partial order [1], so that $S_1 \vdash_P S_2$ if and only if
$\forall d_1 \in S_1 : \exists d_2 \in S_2 \mathrel{.} d_1 \vdash d_2$. A sort of
Egli-Milner partial order relation [1] will also be useful:
$S_1 \vdash_{\mathrm{EM}} S_2$ holds if and only if either $S_1 = \mathbf{0}_P$ or
$S_1 \vdash_P S_2$ and $\forall d_2 \in S_2 : \exists d_1 \in S_1 \mathrel{.} d_1 \vdash d_2$.
An (Egli-Milner) connector for $\hat{D}_P$, denoted by ‘$\boxplus_{\mathrm{EM}}$’, is any
upper bound operator for the Egli-Milner ordering on $\wp_{\mathrm{fn}}(D, \vdash)$. Note
that although a least upper bound for ‘$\vdash_{\mathrm{EM}}$’ may not exist, a connector
can always be defined; for instance, we can let
$S_1 \boxplus_{\mathrm{EM}} S_2 := \{ \bigoplus (S_1 \cup S_2) \}$.
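
To make the two orderings concrete, the following Python sketch checks them on finite sets
of closed intervals. This is only an illustration: it assumes that the base-level domain is
the set of non-empty intervals `(lo, hi)` ordered by inclusion, that powerset elements are
`frozenset`s, and that the connector is the trivial one mentioned above; the function names
are hypothetical.

```python
def leq(a, b):
    """Base-level ordering: interval a is contained in interval b."""
    return b[0] <= a[0] and a[1] <= b[1]

def hoare_leq(s1, s2):
    """Hoare ordering: every element of s1 is below some element of s2."""
    return all(any(leq(d1, d2) for d2 in s2) for d1 in s1)

def em_leq(s1, s2):
    """Egli-Milner-like ordering used in the text."""
    if not s1:
        return True
    covers = all(any(leq(d1, d2) for d1 in s1) for d2 in s2)
    return hoare_leq(s1, s2) and covers

def trivial_connector(s1, s2):
    """Trivial connector: the singleton containing the join of all the elements."""
    elems = s1 | s2
    if not elems:
        return frozenset()
    lo = min(lo for lo, _ in elems)
    hi = max(hi for _, hi in elems)
    return frozenset({(lo, hi)})

s1 = frozenset({(0, 0)})
s2 = frozenset({(0, 0), (1, 1)})
print(hoare_leq(s1, s2), em_leq(s1, s2))
print(trivial_connector(s1, s2))
```

Here `s1` is below `s2` in the Hoare ordering but not in the Egli-Milner one, because the
element `(1, 1)` of `s2` covers no element of `s1`; the connector result is an upper bound
of both sets with respect to the Egli-Milner ordering.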

Besides the requirement on finiteness, another difference with respect to the 
down-set completion of [12] is that we are dropping the assumption about the 
complete distributivity of the concrete domain. This is possible because our 
semantic domains are not necessarily related by Galois connections, so that this 
property does not have to be preserved. 

The finite powerset domain is related to the concrete domain by means of
the concretization function $\gamma_P\colon \wp_{\mathrm{fn}}(D, \vdash) \to C$ defined by

$$\gamma_P(S) := \bigsqcup \{\, \gamma(d) \mid d \in S \,\}.$$

Note that $\gamma_P$ is monotonic but not necessarily injective. For
$S_1, S_2 \in \wp_{\mathrm{fn}}(D, \vdash)$, we write $S_1 \equiv_{\gamma_P} S_2$ to denote
that the two abstract elements actually denote the same concrete element, i.e., when
$\gamma_P(S_1) = \gamma_P(S_2)$. It is easy to see that ‘$\equiv_{\gamma_P}$’ is a
congruence relation on $\hat{D}_P$. As noted in [12], non-redundancy only provides
a partial, syntactic form of reduction. On the other hand, requiring the full,
semantic form of reduction for a finite powerset domain can be computationally
very expensive.

A correct abstract semantic function
$\mathcal{F}^\sharp_P\colon \wp_{\mathrm{fn}}(D, \vdash) \to \wp_{\mathrm{fn}}(D, \vdash)$
on the finite powerset domain may be provided by an ad-hoc definition. More often, if the
concrete semantic function $\mathcal{F}\colon C \to C$ satisfies suitable hypotheses,
$\mathcal{F}^\sharp_P$ can be safely induced from the abstract semantic function
$\mathcal{F}^\sharp\colon D \to D$. For instance, if $\mathcal{F}$ is additive, we can
define $\mathcal{F}^\sharp_P$ as follows [11,18]:

$$\mathcal{F}^\sharp_P(S) := \Omega_D^{\vdash}\bigl( \{\, \mathcal{F}^\sharp(d) \mid d \in S \,\} \bigr).$$

3.1 The Finite Powerset Domain of Polyhedra 

The polyhedral domain $(\widehat{\mathbb{CP}}_n)_P$, having carrier
$\wp_{\mathrm{fn}}(\mathbb{CP}_n, \subseteq)$, is the finite powerset domain over
$\widehat{\mathbb{CP}}_n$. The approximation ordering is ‘$\subseteq_P$’ where, for each
$S_1, S_2 \in \wp_{\mathrm{fn}}(\mathbb{CP}_n, \subseteq)$,

$$S_1 \subseteq_P S_2 \iff \forall \mathcal{P}_1 \in S_1 : \exists \mathcal{P}_2 \in S_2 \mathrel{.} \mathcal{P}_1 \subseteq \mathcal{P}_2.$$

Let $\gamma^A_P$, $\gamma^B_P$ and $\gamma^C_P$ denote the (powerset) concretization
functions induced by $\gamma^A$, $\gamma^B$ and $\gamma^C$, respectively. Then, the relation
‘$\equiv_{\gamma^A_P}$’ makes two finite sets of polyhedra equivalent if and only if they
have the same set-union. The general problem of deciding the semantic equivalence with
respect to $\gamma^A_P$ of two finite (non-redundant) collections of polyhedra is known to
be computationally hard [27]. For $\gamma^B_P$, the relation ‘$\equiv_{\gamma^B_P}$’ makes
two finite sets of polyhedra equivalent if and only if they have the same poly-hull, so
that the powerset construction provides no benefit at all. Finally, $\gamma^C_P$ is
injective so that ‘$\equiv_{\gamma^C_P}$’ coincides with the identity congruence relation.

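Example 1. For the polyhedral domain $(\widehat{\mathbb{CP}}_1)_P$, let¹

$$T_0 := \bigl\{ \{0 \le x \le 2\}, \{1 \le x \le 2\}, \{3 \le x \le 4\}, \{4 \le x \le 5\} \bigr\},$$
$$T_1 := \bigl\{ \{0 \le x \le 2\}, \{3 \le x \le 4\}, \{4 \le x \le 5\} \bigr\},$$
$$T_2 := \bigl\{ \{0 \le x \le 1\}, \{1 \le x \le 2\}, \{3 \le x \le 5\} \bigr\}.$$

Then $T_0 \notin \wp_{\mathrm{fn}}(\mathbb{CP}_1, \subseteq)$, but
$T_1 = \Omega_{\mathbb{CP}_1}^{\subseteq}(T_0) \in \wp_{\mathrm{fn}}(\mathbb{CP}_1, \subseteq)$.
Also, $T_1 \equiv_{\gamma^A_P} T_2$.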
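
The claims of Example 1 can be checked mechanically with the following Python sketch,
which encodes one-dimensional polyhedra as closed intervals `(lo, hi)`. The encoding and
the names `reduce_set` and `union_of` are assumptions made for this illustration; in
particular, `union_of` decides equality of set-unions only for intervals, which is far
easier than the general problem for polyhedra mentioned above.

```python
def leq(a, b):
    """Inclusion between closed intervals."""
    return b[0] <= a[0] and a[1] <= b[1]

def reduce_set(s):
    """Reduction function: drop every element strictly contained in another one."""
    return frozenset(d for d in s if not any(d != e and leq(d, e) for e in s))

def union_of(s):
    """Canonical list of disjoint closed intervals having the same set-union as s."""
    out = []
    for lo, hi in sorted(s):
        if out and lo <= out[-1][1]:          # overlaps or touches the last piece
            out[-1] = (out[-1][0], max(out[-1][1], hi))
        else:
            out.append((lo, hi))
    return out

T0 = frozenset({(0, 2), (1, 2), (3, 4), (4, 5)})
T1 = frozenset({(0, 2), (3, 4), (4, 5)})
T2 = frozenset({(0, 1), (1, 2), (3, 5)})

print(reduce_set(T0) == T1)            # T1 is the non-redundant counterpart of T0
print(union_of(T1) == union_of(T2))    # T1 and T2 denote the same set of points
```

Both comparisons succeed: the interval $\{1 \le x \le 2\}$ is the only redundant element
of $T_0$, and both $T_1$ and $T_2$ have set-union $[0,2] \cup [3,5]$.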



4 Widening the Finite Powerset Domain 

If the domain refinement of the previous section is meant to be used for static 
analysis, then a key ingredient that is still missing is a systematic way of en- 
suring the termination of the analysis. In this section, we describe two widening 
strategies that rely on the existence of a widening
$\nabla\colon D \times D \rightarrowtail D$ on the base-level abstract domain.² We start by
proposing a general specification of an extrapolation operator that lifts this
‘$\nabla$’ operator to the powerset domain.

¹ In this and the following examples, a polyhedron $\mathcal{P} \in \mathbb{CP}_n$ will be
denoted by a corresponding finite set of linear equality and non-strict inequality
constraints.

² If the base-level abstract domain $\hat{D}$ is finite or Noetherian, so that it is not
necessarily endowed with an explicit widening operator, then a dummy widening can be
obtained by considering the least upper bound operator ‘$\oplus$’.

Definition 4. (The $\nabla$-connected extrapolation heuristics.) A partial operator
$h^\nabla_P\colon \wp_{\mathrm{fn}}(D, \vdash)^2 \rightarrowtail \wp_{\mathrm{fn}}(D, \vdash)$
is a $\nabla$-connected extrapolation heuristics for $\hat{D}_P$ if, for all
$S_1, S_2 \in \wp_{\mathrm{fn}}(D, \vdash)$ such that $S_1 \Vdash_P S_2$,
$h^\nabla_P(S_1, S_2)$ is defined and satisfies the following conditions:

$$S_2 \vdash_{\mathrm{EM}} h^\nabla_P(S_1, S_2); \tag{2}$$
$$\forall d \in h^\nabla_P(S_1, S_2) \setminus S_2 : \exists d_1 \in S_1 \mathrel{.} d_1 \Vdash_\nabla d; \tag{3}$$
$$\forall d \in h^\nabla_P(S_1, S_2) \cap S_2 : \bigl( (\exists d_1 \in S_1 \mathrel{.} d_1 \Vdash d) \implies (\exists d'_1 \in S_1 \mathrel{.} d'_1 \Vdash_\nabla d) \bigr). \tag{4}$$

Informally, condition (2) ensures that the result is an upper approximation of
$S_2$ in which every element covers at least one element of $S_2$ (i.e., the heuristics
cannot add elements that are unrelated to $S_2$); conditions (3) and (4) ensure
that in the resulting set, each element covering an element of $S_1$ originates from
an application of ‘$\nabla$’ to a (possibly different) element of $S_1$.

It is straightforward to construct an algorithm for computing a $\nabla$-connected
extrapolation heuristics for any given base-level widening ‘$\nabla$’.

Proposition 1. For all $S_1, S_2 \in \wp_{\mathrm{fn}}(D, \vdash)$ such that
$S_1 \vdash_P S_2$, let

$$h^\nabla_P(S_1, S_2) := S_2 \oplus_P \Omega_D^{\vdash}\bigl( \{\, d_1 \nabla d_2 \in D \mid d_1 \in S_1, d_2 \in S_2, d_1 \Vdash d_2 \,\} \bigr).$$

Then ‘$h^\nabla_P$’ is a $\nabla$-connected extrapolation heuristics for $\hat{D}_P$.

For the finite powerset domain over $\widehat{\mathbb{CP}}_n$, lines 10-15 of the algorithm
specified in [7, Figure 8, page 773] provide an implementation of the heuristics
‘$h^\nabla_P$’ defined in Proposition 1, instantiated with the standard widening,
‘$\nabla_s$’, on $\mathbb{CP}_n$.

Example 2. To see that the ‘$h^\nabla_P$’ defined in Proposition 1 is not a widening for
$(\widehat{\mathbb{CP}}_n)_P$, consider the strictly increasing sequence
$T_0 \subseteq_P T_1 \subseteq_P \cdots$ in $(\widehat{\mathbb{CP}}_1)_P$ defined by
$T_j := \{\, \mathcal{P}_i \mid 0 \le i \le j \,\}$, where $\mathcal{P}_i := \{x = i\}$, for
$i \in \mathbb{N}$. Then, no matter what the specification for ‘$\nabla$’ is, we obtain
$h^\nabla_P(T_j, T_{j+1}) = T_{j+1}$, for all $j \in \mathbb{N}$.
Thus, the “widened” sequence is diverging.
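
The divergence described in Example 2 can be reproduced with a short Python sketch of the
heuristics of Proposition 1. The sketch assumes, for illustration only, that polyhedra in
one dimension are encoded as closed intervals `(lo, hi)` and that the base-level widening
is the interval analogue of the standard widening; the function names are ours.

```python
import math

INF = math.inf

def leq(a, b):
    """Inclusion between closed intervals."""
    return b[0] <= a[0] and a[1] <= b[1]

def widen(a, b):
    """Interval analogue of the standard widening, assuming leq(a, b)."""
    return (a[0] if b[0] >= a[0] else -INF, a[1] if b[1] <= a[1] else INF)

def reduce_set(s):
    """Reduction function: drop every element strictly contained in another one."""
    return frozenset(d for d in s if not any(d != e and leq(d, e) for e in s))

def extrapolate(s1, s2):
    """Heuristics of Proposition 1: widen every strictly increasing pair, then join."""
    widened = {widen(d1, d2) for d1 in s1 for d2 in s2 if d1 != d2 and leq(d1, d2)}
    return reduce_set(s2 | reduce_set(widened))

def T(j):
    """The set T_j = { {x = i} | 0 <= i <= j } of Example 2."""
    return frozenset((i, i) for i in range(j + 1))

for j in range(3):
    print(sorted(extrapolate(T(j), T(j + 1))))
```

Since no point strictly covers another point, the base-level widening is never applied and
each result is just $T_{j+1}$, so the sequence keeps growing.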



4.1 Powerset Widenings Using Egli-Milner Connectors 

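Example 2 shows that, when computing $h^\nabla_P(S_1, S_2)$, divergence is caused by those
elements of $S_2$ that cover none of the elements occurring in $S_1$, i.e., when
$S_1 \nvdash_{\mathrm{EM}} S_2$. Thus, stabilization can be obtained by replacing $S_2$ with
$S_1 \boxplus_{\mathrm{EM}} S_2$, where ‘$\boxplus_{\mathrm{EM}}$’ is a connector for
$\hat{D}_P$. We therefore define a simple widening operator on the finite powerset domain
that uses a connector to ensure termination.

Definition 5. (The ‘${}_{\mathrm{EM}}\nabla_P$’ widening.) Let ‘$h^\nabla_P$’ be a
$\nabla$-connected extrapolation heuristics and ‘$\boxplus_{\mathrm{EM}}$’ be a connector
for $\hat{D}_P$. Let also $S_1, S_2 \in \wp_{\mathrm{fn}}(D, \vdash)$, where
$S_1 \Vdash_P S_2$. Then

$$S_1 \mathbin{{}_{\mathrm{EM}}\nabla_P} S_2 := h^\nabla_P(S_1, S'_2), \quad \text{where} \quad S'_2 := \begin{cases} S_2, & \text{if } S_1 \vdash_{\mathrm{EM}} S_2; \\ S_1 \boxplus_{\mathrm{EM}} S_2, & \text{otherwise.} \end{cases}$$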




142 R. Bagnara, P.M. Hill, and E. Zaffanella 



Theorem 1. The ‘${}_{\mathrm{EM}}\nabla_P$’ operator is a widening on $\hat{D}_P$.



Example 3. To illustrate the widening operator ‘${}_{\mathrm{EM}}\nabla_P$’ we consider the
powerset domain $(\widehat{\mathbb{CP}}_1)_P$, with the standard widening ‘$\nabla_s$’ on
$\mathbb{CP}_1$ and the trivial connector ‘$\boxplus_{\mathrm{EM}}$’ returning the
singleton poly-hull of its arguments. Consider the sequence
$T_0 \subseteq_P T_1 \subseteq_P \cdots$ of Example 2 and the widened sequence
$U_0 \subseteq_P U_1 \subseteq_P \cdots$ where $U_0 = T_0$ and
$U_i = U_{i-1} \mathbin{{}_{\mathrm{EM}}\nabla_P} (U_{i-1} \uplus_P T_i)$, for each $i > 0$.
When computing $U_1$, the second argument of the widening is $U_0 \uplus_P T_1 = T_1$. Note
that $U_0 \vdash_{\mathrm{EM}} T_1$ does not hold so that the connector is needed. Thus, we
obtain

$$U_1 = h^\nabla_P(U_0, U_0 \boxplus_{\mathrm{EM}} T_1) = h^\nabla_P\bigl(U_0, \{\{0 \le x \le 1\}\}\bigr) = \bigl\{ \{0 \le x\} \bigr\}.$$

In the next iteration we obtain stabilization. Clearly, in general the precision of
this widening will depend on the chosen connector operator.
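
The computation of Example 3 can be replayed with the following Python sketch. As before,
it relies on assumptions that are not part of the formal development: one-dimensional
polyhedra are encoded as closed intervals, the base-level widening is the interval analogue
of the standard widening, the heuristics is the one of Proposition 1 and the connector is
the trivial one. All names are hypothetical.

```python
import math

INF = math.inf

def leq(a, b):
    return b[0] <= a[0] and a[1] <= b[1]

def widen(a, b):
    """Interval analogue of the standard widening, assuming leq(a, b)."""
    return (a[0] if b[0] >= a[0] else -INF, a[1] if b[1] <= a[1] else INF)

def reduce_set(s):
    return frozenset(d for d in s if not any(d != e and leq(d, e) for e in s))

def pjoin(s1, s2):
    """Join of the finite powerset domain."""
    return reduce_set(s1 | s2)

def em_leq(s1, s2):
    """Egli-Milner-like ordering."""
    if not s1:
        return True
    return (all(any(leq(d1, d2) for d2 in s2) for d1 in s1)
            and all(any(leq(d1, d2) for d1 in s1) for d2 in s2))

def connector(s1, s2):
    """Trivial connector: singleton poly-hull (s2 is assumed to be non-empty)."""
    elems = s1 | s2
    return frozenset({(min(lo for lo, _ in elems), max(hi for _, hi in elems))})

def extrapolate(s1, s2):
    """Heuristics of Proposition 1."""
    widened = {widen(d1, d2) for d1 in s1 for d2 in s2 if d1 != d2 and leq(d1, d2)}
    return pjoin(s2, reduce_set(widened))

def em_widen(s1, s2):
    """Connector-based powerset widening of Definition 5."""
    return extrapolate(s1, s2 if em_leq(s1, s2) else connector(s1, s2))

def T(j):
    return frozenset((i, i) for i in range(j + 1))

u = T(0)
for i in range(1, 4):
    arg = pjoin(u, T(i))
    if arg == u:          # the sequence is stable: no widening is needed
        break
    u = em_widen(u, arg)
print(u)
```

Under these assumptions the loop performs a single widening step, yielding the singleton
containing $\{0 \le x\}$, and detects stabilization at the next iteration, in agreement
with the example.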

For the polyhedral domain $\widehat{\mathbb{CP}}_n$, the powerset widening
‘${}_{\mathrm{EM}}\nabla_P$’ using the ‘$h^\nabla_P$’ heuristics defined in Proposition 1
is similar to but not quite the same as the
operator sketched in [7]. As noted in that paper, the algorithm in [7, Figure 8, 
page 773] cannot ensure the termination of the analysis. To this end, instead 
of using a connector operator it is proposed that, when the cardinality of the 
abstract collection reaches a fixed threshold, a further poly-hull approximation 
be applied. However, there are examples indicating that such an approach can- 
not come with a termination guarantee when considering arbitrary increasing 
sequences [5]. 



4.2 Powerset Widenings Using Finite Convergence Certificates 

We now present another widening operator (denoted here by ‘${}_{\mu}\nabla_P$’) for the
finite powerset domain. This requires that the operator ‘$\nabla$’ defined on the
base-level domain is provided with a (finitely computable) finite convergence certificate.
Formally, a finite convergence certificate for ‘$\nabla$’ (on $\hat{D}$) is a triple
$(O, \succ, \mu)$ where $(O, \succ)$ is a well-founded ordered set and
$\mu\colon D \to O$, which is called level mapping, is such that, for all
$d_1 \Vdash d_2 \in D$, $\mu(d_1) \succ \mu(d_1 \nabla d_2)$. We will abuse
notation by writing $\mu$ to denote the certificate $(O, \succ, \mu)$.

Example 4. For the polyhedral domain $\widehat{\mathbb{CP}}_n$ and the standard widening
‘$\nabla_s$’, we can define a certificate $(O_s, \succ_s, \mu_s)$ where $O_s$ is the set of
pairs $\mathbb{N} \times \mathbb{N}$, ‘$\succ_s$’ the lexicographic ordering of the pair
using $>$ for the individual ordering of the components and
$\mu_s\colon \mathbb{CP}_n \to O_s$ the level mapping
$\mu_s(\mathcal{P}) = \bigl(n - \dim(\mathcal{P}), k\bigr)$, where $\dim(\mathcal{P})$ is
the dimension of $\mathcal{P}$ and $k$ the minimal number of half-spaces needed to define
$\mathcal{P}$. Similarly, a certificate for the widening ‘$\nabla$’ on $\mathbb{CP}_n$
proposed in [4] can be obtained by considering the level mapping
$\mu_b\colon \mathbb{CP}_n \to O_b$ induced by the limited growth ordering relation
‘$\curvearrowright$’ defined in [4], so that we have
$\mu_b(\mathcal{P}_1) \succ_b \mu_b(\mathcal{P}_2)$ if and only if
$\mathcal{P}_1 \curvearrowright \mathcal{P}_2$. For lack of space, we refer the reader
to [4].

Given a certificate for ‘$\nabla$’, we can define a suitable limited growth ordering
relation ‘$\curvearrowright_P$’ for the finite powerset domain that satisfies the
ascending chain condition.

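Definition 6. (The ‘$\curvearrowright_P$’ relation.) The relation
$\curvearrowright_P \subseteq \wp_{\mathrm{fn}}(D, \vdash) \times \wp_{\mathrm{fn}}(D, \vdash)$
induced by the certificate $\mu$ for ‘$\nabla$’ is such that, for each
$S_1, S_2 \in \wp_{\mathrm{fn}}(D, \vdash)$, $S_1 \curvearrowright_P S_2$ if and only if
either one of the following conditions holds:

$$\mu(\textstyle\bigoplus S_1) \succ \mu(\textstyle\bigoplus S_2); \tag{5}$$
$$\mu(\textstyle\bigoplus S_1) = \mu(\textstyle\bigoplus S_2) \land \# S_1 > 1 \land \# S_2 = 1; \tag{6}$$
$$\mu(\textstyle\bigoplus S_1) = \mu(\textstyle\bigoplus S_2) \land \# S_1 > 1 \land \# S_2 > 1 \land \tilde{\mu}(S_1) \gg \tilde{\mu}(S_2); \tag{7}$$

where, for each $S \in \wp_{\mathrm{fn}}(D, \vdash)$, $\tilde{\mu}(S)$ denotes the multiset
over $O$ obtained by applying $\mu$ to each abstract element in $S$.

Proposition 2. The ‘$\curvearrowright_P$’ relation satisfies the ascending chain condition.

Intuitively, the relation ‘$\curvearrowright_P$’ will induce a certificate
$\mu_P\colon \hat{D}_P \to O_P$ for the new widening. Namely, by defining
$\mu_P(S_1) \succ_P \mu_P(S_2)$ if and only if $S_1 \curvearrowright_P S_2$, we
will obtain $\mu_P(S_1) \succ_P \mu_P(S_1 \mathbin{{}_{\mu}\nabla_P} S_2)$.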

The specification of our “certificate-based widening” assumes the existence 
of a subtract operation for the base-level domain. It is expected that a specific 
subtraction would be provided for each domain; here we just indicate a minimal 
specification. 

Definition 7. (Subtraction.) The partial operator
$\ominus\colon D \times D \rightarrowtail D$ is a subtraction for $\hat{D}$ if, for all
$d_1, d_2 \in D$ such that $d_2 \vdash d_1$, we have $d_1 \ominus d_2 \vdash d_1$ and
$d_1 = (d_1 \ominus d_2) \oplus d_2$.

A trivial subtraction operator can always be defined as $d_1 \ominus d_2 := d_1$.

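Example 5. In $\widehat{\mathbb{CP}}_n$, the function
$\mathrm{diff}\colon \mathbb{CP}_n \times \mathbb{CP}_n \to \mathbb{CP}_n$ is defined so
that, for any $\mathcal{P}, \mathcal{Q} \in \mathbb{CP}_n$,
$\mathrm{diff}(\mathcal{P}, \mathcal{Q})$ denotes the smallest closed and convex polyhedron
containing the set difference $\mathcal{P} \setminus \mathcal{Q}$. Then, if
$\mathcal{Q} \subseteq \mathcal{P}$, we have
$\mathrm{diff}(\mathcal{P}, \mathcal{Q}) \subseteq \mathcal{P}$ and

$$\mathcal{P} = (\mathcal{P} \setminus \mathcal{Q}) \cup \mathcal{Q} = \mathrm{diff}(\mathcal{P}, \mathcal{Q}) \cup \mathcal{Q} = \mathrm{diff}(\mathcal{P}, \mathcal{Q}) \uplus \mathcal{Q},$$

so that ‘diff’ is a subtraction.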
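
For one-dimensional polyhedra the ‘diff’ operator has a simple closed form, sketched below
in Python. The encoding of polyhedra as closed intervals `(lo, hi)`, with `None` for the
empty polyhedron, and the function names are assumptions made for this illustration.

```python
import math

def join(a, b):
    """Poly-hull of two intervals; None stands for the empty polyhedron."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def leq(a, b):
    """Inclusion between (possibly empty) closed intervals."""
    if a is None:
        return True
    if b is None:
        return False
    return b[0] <= a[0] and a[1] <= b[1]

def diff(p, q):
    """Smallest closed interval containing the points of p that are not in q."""
    if p is None or q is None:
        return p
    # Closure of the part of p lying to the left of q, if any.
    left = (p[0], min(p[1], q[0])) if p[0] < q[0] else None
    # Closure of the part of p lying to the right of q, if any.
    right = (max(p[0], q[1]), p[1]) if q[1] < p[1] else None
    return join(left, right)

p, q = (0, math.inf), (0, 3)
d = diff(p, q)
print(d, leq(d, p), join(d, q) == p)
```

For $\mathcal{P} = \{0 \le x\}$ and $\mathcal{Q} = \{0 \le x \le 3\}$ the result is
$\{3 \le x\}$, and both requirements of Definition 7 are satisfied.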

We can now define the certificate-based widening ‘${}_{\mu}\nabla_P$’.

Definition 8. (The ‘${}_{\mu}\nabla_P$’ widening.) Let ‘$\curvearrowright_P$’ be the
limited growth ordering induced by the certificate $\mu$ for ‘$\nabla$’ and let
‘$\boxplus_P$’ be any upper bound operator on $\hat{D}_P$. Let
$S_1, S_2 \in \wp_{\mathrm{fn}}(D, \vdash)$ be such that $S_1 \Vdash_P S_2$. Also, if
$\bigoplus S_1 \Vdash \bigoplus (S_1 \boxplus_P S_2)$, let $d \in D$ be defined as
$d := \bigl( \bigoplus S_1 \nabla \bigoplus (S_1 \boxplus_P S_2) \bigr) \ominus \bigl( \bigoplus (S_1 \boxplus_P S_2) \bigr)$.
Then

$$S_1 \mathbin{{}_{\mu}\nabla_P} S_2 := \begin{cases} S_1 \boxplus_P S_2, & \text{if } S_1 \curvearrowright_P S_1 \boxplus_P S_2; \\ (S_1 \boxplus_P S_2) \oplus_P \{d\}, & \text{if } \bigoplus S_1 \Vdash \bigoplus (S_1 \boxplus_P S_2); \\ \{ \bigoplus S_2 \}, & \text{otherwise.} \end{cases}$$

In the first case, we simply return the upper bound $S_1 \boxplus_P S_2$, since this is
enough to ensure a strict decrease in the level mapping. In the second case, the join of
$S_1$ is strictly more precise than the join of $S_1 \boxplus_P S_2$, so that we apply
‘$\nabla$’ to them and then, using the subtraction operator, improve the obtained result,
since $S_1 \curvearrowright_P (S_1 \boxplus_P S_2) \oplus_P \{d\}$ holds. In the last case,
since the join of $S_1 \boxplus_P S_2$ is invariant, we return the singleton consisting of
the join itself, as originally proposed in [11, Section 9].

Theorem 2. The ‘${}_{\mu}\nabla_P$’ operator is a widening on $\hat{D}_P$.

Example 6. To illustrate the last two cases of Definition 8, consider the domain
$(\widehat{\mathbb{CP}}_1)_P$ with the standard widening ‘$\nabla_s$’ for $\mathbb{CP}_1$
certified by the level mapping $\mu_s$ defined in Example 4 and the upper bound operator
‘$\boxplus_P$’ defined as ‘$\oplus_P$’ so that $S_1 \boxplus_P S_2 = S_2$ always holds.

Let $T_1 = \bigl\{ \{0 \le x \le 1\} \bigr\}$ and
$T_2 = \bigl\{ \{0 \le x \le 1\}, \{2 \le x \le 3\} \bigr\}$. Then
$T_1 \not\curvearrowright_P T_2$, so that the condition for the first case in Definition 8
does not hold. The poly-hulls of $T_1$ and $T_2$ are $\{0 \le x \le 1\}$ and
$\{0 \le x \le 3\}$, respectively, so that the condition for the second case holds. Since
$\biguplus T_1 \nabla_s \biguplus T_2 = \{0 \le x\}$, then by letting the polyhedron
$\mathcal{P}$ be the element $d$ as specified in Definition 8, we obtain
$\mathcal{P} = \mathrm{diff}\bigl(\{0 \le x\}, \{0 \le x \le 3\}\bigr) = \{3 \le x\}$, so that

$$T_1 \mathbin{{}_{\mu}\nabla_P} T_2 = T_2 \uplus_P \{\mathcal{P}\} = \bigl\{ \{0 \le x \le 1\}, \{2 \le x \le 3\}, \{3 \le x\} \bigr\}.$$

Now let $T_3 = \bigl\{ \{x = 1\}, \{x = 3\} \bigr\}$ and
$T_4 = \bigl\{ \{x = 1\}, \{x = 2\}, \{x = 3\} \bigr\}$. Then
$T_3 \not\curvearrowright_P T_4$, so that the condition for the first case in Definition 8
does not hold. Moreover, $\biguplus T_3 = \biguplus T_4 = \{1 \le x \le 3\}$, so that the
second case does not apply either. Thus,
$T_3 \mathbin{{}_{\mu}\nabla_P} T_4 = \bigl\{ \{1 \le x \le 3\} \bigr\}$.
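
Both computations of Example 6 can be replayed with the following Python sketch of the
certificate-based widening. It is a sketch under several assumptions: one-dimensional
polyhedra are encoded as non-empty closed intervals, the base-level widening is the
interval analogue of the standard widening, the level mapping is $\mu_s$ specialized to
$n = 1$, the subtraction is ‘diff’ and the upper bound operator is the powerset join. All
function names are hypothetical.

```python
import math
from collections import Counter
from functools import reduce

INF = math.inf

def leq(a, b):
    return b[0] <= a[0] and a[1] <= b[1]

def join(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    """Interval analogue of the standard widening, assuming leq(a, b)."""
    return (a[0] if b[0] >= a[0] else -INF, a[1] if b[1] <= a[1] else INF)

def reduce_set(s):
    return frozenset(d for d in s if not any(d != e and leq(d, e) for e in s))

def big_join(s):
    return reduce(join, s)

def level(d):
    """Level mapping for n = 1: (1 - dimension, number of half-spaces)."""
    dim = 0 if d[0] == d[1] else 1
    return (1 - dim, (d[0] != -INF) + (d[1] != INF))

def multiset_greater(m, n):
    """Multiset ordering over totally ordered levels (tuples compare lexicographically)."""
    cm, cn = Counter(m), Counter(n)
    for j in sorted(set(cm) | set(cn), reverse=True):
        if cm[j] != cn[j]:
            return cm[j] > cn[j]
    return False

def limited_growth(s1, s2):
    """The relation of Definition 6."""
    l1, l2 = level(big_join(s1)), level(big_join(s2))
    if l1 != l2:
        return l1 > l2                                    # condition (5)
    if len(s1) > 1 and len(s2) == 1:
        return True                                       # condition (6)
    return (len(s1) > 1 and len(s2) > 1                   # condition (7)
            and multiset_greater(map(level, s1), map(level, s2)))

def diff(p, q):
    """Smallest closed interval containing p minus q (None if that set is empty)."""
    left = (p[0], min(p[1], q[0])) if p[0] < q[0] else None
    right = (max(p[0], q[1]), p[1]) if q[1] < p[1] else None
    if left is None or right is None:
        return left if right is None else right
    return join(left, right)

def cert_widen(s1, s2):
    """Certificate-based widening of Definition 8, with the powerset join as upper bound."""
    ub = reduce_set(s1 | s2)
    if limited_growth(s1, ub):
        return ub                                         # first case
    j1, j2 = big_join(s1), big_join(ub)
    if j1 != j2 and leq(j1, j2):                          # second case
        d = diff(widen(j1, j2), j2)
        return reduce_set(ub | ({d} if d is not None else set()))
    return frozenset({big_join(s2)})                      # last case

T1 = frozenset({(0, 1)})
T2 = frozenset({(0, 1), (2, 3)})
T3 = frozenset({(1, 1), (3, 3)})
T4 = frozenset({(1, 1), (2, 2), (3, 3)})
print(sorted(cert_widen(T1, T2)))
print(sorted(cert_widen(T3, T4)))
```

The first call falls into the second case of Definition 8 and adds the half-line
$\{3 \le x\}$; the second call falls into the last case and returns the singleton
poly-hull.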

As shown in the example above, Definition 8 does not require that the upper
bound operator ‘$\boxplus_P$’ is based on the base-level widening ‘$\nabla$’. Moreover,
the scheme of Definition 8 can be easily extended to any finite set of heuristically
chosen upper bound operators on $\hat{D}_P$, still obtaining a proper widening operator.
The simplest heuristics, already used in the example above, is the one
taking $\boxplus_P := \oplus_P$. If this fails to ensure a decrease in the level mapping,
another possibility is the adoption of a $\nabla$-connected extrapolation heuristics
‘$h^\nabla_P$’ for $\hat{D}_P$. Anyway, many variations could be defined, depending on the
required precision/efficiency trade-off. In the following section, we investigate one of
these possibilities, which originates as a generalization of an idea proposed in [7].

5 Merging Elements According to a Congruence Relation 

When computing a powerset widening $S_1 \nabla_P S_2$, no matter if it is based on an
Egli-Milner connector or a finite convergence certificate, some of the elements
occurring in the second argument $S_2$ can be merged together (i.e., joined) without
affecting the finite convergence guarantee. This merging operation can be guided
by a congruence relation on the finite powerset domain $\hat{D}_P$, the idea being that a
well-chosen relation will benefit the precision/efficiency trade-off of the widening.

One option is to use semantics preserving congruence relations, i.e., refinements
of the congruence relation ‘$\equiv_{\gamma_P}$’. The availability of relatively efficient
but incomplete tests for semantic equivalence can thus be exploited to improve the
efficiency and/or the precision of the analysis. As the purpose of this paper is to
provide generic widening procedures for powersets that are independent of the
underlying domains and hence, of any intended concretizations, here we define
these congruences in a way that is independent of the particular concrete domain
adopted. Two such relations are the identity congruence relation, where no
non-trivial equivalence is assumed, and the $\oplus$-congruence relation, where sets that
have the same join are equivalent. However, the identity congruence will have no
influence on the convergence of the iteration sequence, while the $\oplus$-congruence
is usually the basis of the default, roughest heuristics for ensuring termination.
We now define a new congruence relation that lies between these extremes.

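Definition 9. (‘$\trianglelefteq$’ and ‘$\bowtie$’.) The content relation
$\trianglelefteq \subseteq \wp_{\mathrm{fn}}(D, \vdash) \times \wp_{\mathrm{fn}}(D, \vdash)$
is such that $S_1 \trianglelefteq S_2$ holds if and only if for all
$S'_1 \in \wp_{\mathrm{fn}}(D, \vdash)$ where $S'_1 \vdash_P S_1$ there exists
$S''_1 \in \wp_{\mathrm{fn}}(D, \vdash)$ such that $\bigoplus S'_1 = \bigoplus S''_1$ and
$S''_1 \vdash_P S_2$. The same-content relation
$\bowtie \subseteq \wp_{\mathrm{fn}}(D, \vdash) \times \wp_{\mathrm{fn}}(D, \vdash)$ is such
that $S_1 \bowtie S_2$ holds if and only if $S_1 \trianglelefteq S_2$ and
$S_2 \trianglelefteq S_1$.

Observe that the identity congruence relation can be obtained by strengthening
the conditions in the definition of ‘$\trianglelefteq$’, replacing
$\bigoplus S'_1 = \bigoplus S''_1$ with $S'_1 = S''_1$; and the $\oplus$-congruence can be
obtained by weakening the conditions, replacing $S''_1 \vdash_P S_2$ with
$\bigoplus S''_1 \vdash \bigoplus S_2$. Thus the same-content relation is a compromise
between keeping all the information provided by the explicit set structure, as
done by the identity congruence, and losing all of this information, as occurs
with the $\oplus$-congruence.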

For the finite powerset domain of polyhedra $(\widehat{\mathbb{CP}}_n)_P$, the content
relation ‘$\trianglelefteq$’ corresponds to the condition that all the points in polyhedra
in the first set are contained by polyhedra in the second set; and hence, the same-content
congruence relation ‘$\bowtie$’ coincides with the induced congruence relation
‘$\equiv_{\gamma^A_P}$’.

Proposition 3. For all $S_1, S_2 \in \wp_{\mathrm{fn}}(\mathbb{CP}_n, \subseteq)$,
$S_1 \bowtie S_2$ if and only if $S_1 \equiv_{\gamma^A_P} S_2$.

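Example 7. For $T_1, T_2 \in \wp_{\mathrm{fn}}(\mathbb{CP}_1, \subseteq)$ as defined in
Example 1, we have $T_1 \bowtie T_2$. Consider also
$T_3, T_4 \in \wp_{\mathrm{fn}}(\mathbb{CP}_1, \subseteq)$ where

$$T_3 := \bigl\{ \{0 \le x \le 3\}, \{1 \le x \le 5\} \bigr\}, \qquad T_4 := \bigl\{ \{0 \le x \le 5\} \bigr\}.$$

Then $T_3 \bowtie T_4$ and also $T_1 \trianglelefteq T_4$, although the converse does not
hold. To see this, let $S_1$, $S_2$, and $S'_1$ in Definition 9 be $T_4$, $T_1$, and
$T'_1 := \bigl\{ \{x = 2.5\} \bigr\}$, respectively. Then, if $T''_1$ is such that
$\biguplus T''_1 = \biguplus T'_1$ and $T''_1 \subseteq_P T'_1$, we must have
$T''_1 = T'_1 \nsubseteq_P T_1$; hence, although $T_4 = \{ \biguplus T_1 \}$, we have
$T_4 \not\trianglelefteq T_1$.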

We now define an operation merger that is parametric with respect to the
congruence relation and replaces selected subsets by congruent singleton sets.

Fig. 1. Merging polyhedra according to ‘$\bowtie$’.



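Definition 10. (Merge and mergers.) Let $R$ be a congruence relation on
$D_P$. Then the merge relation $\mathrm{merge}_R \subseteq \wp_{\mathrm{fn}}(D, \vdash) \times \wp_{\mathrm{fn}}(D, \vdash)$ for $R$ is such that

$\mathrm{merge}_R(S_1, S_2)$ holds if and only if $S_1 \vdash_P S_2$ and

$\forall d_2 \in S_2 : \exists S_1' \subseteq S_1 \mathrel{.} d_2 = \bigoplus S_1' \land \{d_2\} \mathrel{R} S_1'.$

A set $S \in \wp_{\mathrm{fn}}(D, \vdash)$ is fully-merged for $R$, if $\mathrm{merge}_R(S, S')$ implies $S = S'$; $S$ is
pairwise-merged for $R$ if for all $d_1, d_2 \in S$, we have that $\{d_1, d_2\}$ is fully-merged.
An operator $\uparrow_R \colon \wp_{\mathrm{fn}}(D, \vdash) \to \wp_{\mathrm{fn}}(D, \vdash)$ is a merger for $R$ if $\mathrm{merge}_R(S, \uparrow_R S)$
holds.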

Note that, for all S € pfn(D, h) and congruence relations R, we have S Pem tn S. 

For the finite powerset domain over $\mathbb{CP}_n$, lines 1-9 of the algorithm specified
in [7, Figure 8, page 773] define a merger operator ‘$\uparrow_{\bowtie}$’ such that, for each finite
set $S$ of polyhedra, $\uparrow_{\bowtie} S$ is pairwise-merged.

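Example 8. Figure 1 shows two examples of sets of polyhedra. In the left-hand
diagram, the set $\mathcal{T} = \{\mathcal{P}_1, \mathcal{P}_2, \mathcal{P}_3\}$ of three squares is not pairwise-merged for
‘$\bowtie$’ since $\mathcal{P}_1 \cup \mathcal{P}_2$ and $\mathcal{P}_2 \cup \mathcal{P}_3$ are convex polyhedra. Both $\mathcal{T}_1 = \{\mathcal{P}_1 \cup \mathcal{P}_2, \mathcal{P}_3\}$
and $\mathcal{T}_2 = \{\mathcal{P}_1, \mathcal{P}_2 \cup \mathcal{P}_3\}$ are fully-merged and hence pairwise-merged for ‘$\bowtie$’,
and $\mathrm{merge}_{\bowtie}(\mathcal{T}, \mathcal{T}_i)$ holds for $i = 1, 2$. In the right-hand diagram, the set
$\mathcal{T}' = \{\mathcal{Q}_1, \mathcal{Q}_2, \mathcal{Q}_3, \mathcal{Q}_4, \mathcal{Q}_5\}$ is pairwise-merged but not fully-merged for ‘$\bowtie$’. Since
$\mathcal{Q}' := \bigcup \mathcal{T}'$ is a convex polyhedron, the singleton set $\{\mathcal{Q}'\}$ is fully-merged and
hence pairwise-merged for ‘$\bowtie$’ and $\mathrm{merge}_{\bowtie}(\mathcal{T}', \{\mathcal{Q}'\})$ holds.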
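
To make the idea of a merger concrete, the following sketch works in one dimension only, where a closed convex polyhedron is simply an interval and two intervals have a convex union exactly when they overlap or touch. It is an illustration under that simplifying assumption, not the algorithm of [7] or the PPL implementation; the class names Interval and IntervalMerger are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical 1-D stand-in for a closed convex polyhedron: the interval [lo, hi]. */
final class Interval {
    final double lo, hi;
    Interval(double lo, double hi) { this.lo = lo; this.hi = hi; }
    @Override public String toString() { return "[" + lo + ", " + hi + "]"; }
}

final class IntervalMerger {
    /**
     * Replaces intervals whose union is convex (they overlap or touch) by that
     * union, so that no two intervals in the result can be merged any further.
     */
    static List<Interval> merge(List<Interval> input) {
        List<Interval> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparingDouble((Interval i) -> i.lo));
        List<Interval> out = new ArrayList<>();
        for (Interval cur : sorted) {
            if (!out.isEmpty() && cur.lo <= out.get(out.size() - 1).hi) {
                // The union with the last interval is convex: keep only the union.
                Interval last = out.remove(out.size() - 1);
                out.add(new Interval(last.lo, Math.max(last.hi, cur.hi)));
            } else {
                out.add(cur);
            }
        }
        return out;
    }
}
```

Each interval in the result is the union of a subset of the input intervals and contains exactly the same points as that subset, which is the shape of condition that the merge relation asks for when R is the same-content relation.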

6 Conclusion 

We have studied the problem of endowing any abstract domain obtained by 
means of the finite powerset construction with a provably correct widening oper- 
ator. We have proposed two generic widening operators and we have instantiated 
our techniques, which are completely general, on powersets of convex polyhedra, 
an abstract domain that is being used for static analysis and abstract model- 
checking and for which no non-trivial widening operator was previously known. 
We have extended the Parma Polyhedra Library (PPL) [3,6], a modern C++
library for the manipulation of convex polyhedra, with a prototype implemen- 
tation of the widenings and their variants employing the ‘widening up to’ tech- 
nique [20,21]. The experimental work has just started, but the initial results 
obtained are very encouraging as our new widenings compare favorably, both in 
terms of precision and efficiency, with the extrapolation operator of [7]. 

Abstract. We study the type system introduced by Boyapati and Ri-
nard in their paper “A Parameterized Type System for Race-Free Java 
Programs” and try to infer the type annotations (“lock types”) needed 
by their type checker to show that a program is free of race conditions. 
Boyapati and Rinard automatically generate some of these annotations 
using default types and static inference of lock types for local variables, 
but in practice, the programmer still needs to annotate on the order of 1 
in every 25 lines of code. We use run-time techniques, based on the lock- 
set algorithm, in conjunction with some static analysis to automatically 
infer most or all of the annotations. 



1 Introduction 

Type systems are well established as an effective technique for ensuring at 
compile-time that programs are free from a wide variety of errors. New type 
systems are being developed by researchers at an alarming rate. Many of them 
are very elaborate and expressive. 

Types provide valuable compile-time guarantees, but at a cost: the program- 
mer must annotate the program with types. Annotating new code can be a 
significant burden on programmers. Annotating legacy code is a much greater 
burden, because of the vast quantity of legacy code, and because a program- 
mer might need to spend a long time studying the legacy code before he or she 
understands the code well enough to annotate it. 

Type inference reduces this burden by automatically determining types for 
some or all parts of the program. A type inference mechanism is complete if it 
can infer types for all typable programs. 

Traditional type inference is based on static analysis. A common approach is 
constraint-based type inference, which works by constructing a system of con- 
straints (of appropriate forms) that express relationships between the types of 
different parts of the program and then solving the resulting constraints. 

Unfortunately, complete type inference is impossible or infeasible for many 
expressive type systems. This motivates the development of incomplete type in- 
ference algorithms. These algorithms fall on a spectrum that embodies a trade-off 

* This work was supported in part by NSF under Grant CCR-9876058 and
ONR under Grants N00014-01-1-0109 and N00014-02-1-0363. Authors’ Email:
{ragarwal, stoller}@cs.sunysb.edu Web:
http://www.cs.sunysb.edu/~{ragarwal, stoller}



B. Steffen and G. Levi (Eds.): VMCAI 2004, LNCS 2937, pp. 149-160, 2004. 
© Springer-Verlag Berlin Heidelberg 2004




150 R. Agarwal and S.D. Stoller 



between computational cost and power. Roughly speaking, we measure an algorithm’s
power by how many annotations the user must supply in order for the
algorithm to successfully infer the remaining types. For some important type systems,
even incomplete algorithms designed to infer most types for most programs
encountered in practice may have prohibitive exponential time complexity.

We have developed a new run-time approach to type inference that, for some 
type systems, appears to be more effective in practice than traditional static type 
inference. We monitor some executions of the program and infer (one might say 
“guess”) candidate types based on the observed behavior. 

A premise underlying this work is that data from a small number of simple 
executions is sufficient to infer most or all of the types. In particular, it is not 
necessary for the monitored executions collectively to achieve — or even come 
close to — full statement coverage in order to support successful inference of types 
for the entire program. Another premise is that the type inference is relatively 
insensitive to the choice of test inputs. Experience with Daikon [Ern00] supports
this idea. 

This approach has some obvious theoretical limitations. It is not complete, 
because the process of generalizing from relationships between specific objects 
in a particular execution to static relationships between expressions or state- 
ments in the program is based in part on (incomplete) heuristics. Run-time type 
inference is unsound (i.e., the inferred types might not satisfy the type checking
rules), because the observed behavior is not necessarily characteristic of all
possible behaviors of the program, and correct types express properties that 
should hold for all possible behaviors of the program. We transform this un- 
soundness into incompleteness by running the type-checker after type inference. 
If the type-checker signals an error, we report that type inference failed. This is 
not a problem in practice, provided it is rare. 

Despite these theoretical limitations, from a pragmatic point of view, the 
most important question for a given type system is whether run-time type infer- 
ence, traditional static type inference, or a combination of them will provide the 
greatest overall reduction of users’ annotation burden. As a first step towards 
the empirical evaluation of run-time type inference, we developed and imple- 
mented a run-time type inference algorithm for a recently proposed type system 
for concurrent programs. 

Concurrent programs are notorious for containing errors that are difficult 
to reproduce and diagnose at run-time. This inspired the development of type 
and effect systems (for brevity, we will call them “type systems” hereafter) that 
statically ensure the absence of some common kinds of concurrent programming 
errors. Flanagan and Freund [FF00] developed a type system that ensures that
a Java program is race-free, i.e., contains no race conditions; a race condition 
occurs when two threads concurrently access a shared variable and at least one 
of the accesses is a write. The resulting programming language (i.e., Java with 
their extensions to the type system) is called Race Free Java. Boyapati and 
Rinard [BR01] modified and extended Flanagan and Freund’s type system to
make it more expressive. The resulting programming language is called
Parameterized Race Free Java (PRFJ). Hereafter, we assume that programs contain
all type information required by the standard Java type-checker, and we use the 
word “types” to refer only to the additional type information required by these 
extended type systems. 

These type systems encode programming patterns that experienced program- 
mers often use to avoid these errors. Specifically, these type systems track the use 
of locks to “protect” (i.e., prevent concurrent access to) shared data structures.
Although these type systems are occasionally undesirably restrictive, experience 
indicates that they are sufficiently expressive for many programs. 

The cost of expressiveness is that type inference for these type systems is 
difficult. In Flanagan and Freund’s experiments with three medium or large pro- 
grams, well-chosen defaults, in combination with some potentially unsafe (but 
usually safe in practice) “escapes” from the type system, reduce the number of 
annotations needed to about 12 annotations/KLOC (KLOC denotes “thousand
lines of code”) on average [FF00]. For a fair comparison with type inference for
PRFJ (discussed below), note that PRFJ programs typically require more annotations
than this, primarily because the PRFJ type system is more expressive
and does not rely on potentially unsafe escapes. 

Flanagan and Freund subsequently developed a simple type inference algo- 
rithm for their type system. Roughly speaking, it starts with a set of candidate 
types for each expression, runs the type checker, deletes some of the candidate 
types based on the errors (if any) reported by the type checker, and repeats this 
process until the type checker reports no errors [FF01]. However, this algorithm
infers types only for a restricted version of the type system — specifically, a ver- 
sion without external locks. Elimination of external locks significantly reduces 
the expressiveness of the type system. 

Boyapati and Rinard [BR01] use carefully-chosen defaults and local type inference
to reduce the annotation burden on users of PRFJ. The user provides
type annotations on selected declarations of classes, fields, and methods, and
selected object allocation sites (i.e., calls to new). Default types are used where
annotations are omitted. A simple intra-procedural constraint-based type inference
algorithm is used to infer types for local variables of each method. In
their experiments with several small programs, users needed to supply about 25
annotations/KLOC [BR01].

We believe that type systems like PRFJ are a promising practical approach to
verification of race-freedom for programs that use locks for synchronization, ex- 
cept that the annotation burden is currently too high. We attempted to develop 
static inter-procedural type inference algorithms for PRFJ, using abstract interpretation
[CC77] and constraint-based analysis (e.g., instantiation constraints
[AKC02]), but the analysis algorithms were computationally expensive, and we 
did not believe they would scale to large programs. 

This motivated us to develop and implement a run-time type inference algo- 
rithm for PRFJ. The program is instrumented by an automatic source-to-source 
transformation. The instrumented program writes relevant information (mainly 
information about which locks are held when various objects are accessed) to 
a log file. Analysis of the log, together with a simple static program analysis
that identifies unique pointers [AKC02], produces type annotations at selected 
program points. Boyapati and Rinard’s simple intra-procedural type inference 
algorithm is then used to propagate the resulting types to other program points. 
This has the crucial effect of propagating type information into branches of the 
program that were not exercised in the monitored executions. Our experience, 
reported in detail in Section 6, is that run-time type inference provides a signif- 
icant further reduction in the annotation burden. 

If multiple types for (say) a field declaration are consistent with the logged 
information, we use heuristics to prioritize them. If the candidate type with the 
highest priority is rejected by the type checker (because it is inconsistent with 
the types at other program points), we try the next one. In our experiments so 
far, the highest-priority choice has always worked, so this iterative approach has
not been needed (or implemented). 
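
The fallback described above (try candidates in priority order until the type checker accepts one) was not implemented by the authors; the following is only a hypothetical sketch of its control flow. The class and method names are invented for illustration, and the type checker is abstracted as a predicate on a candidate owner.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class CandidateSelection {
    /**
     * Returns the first candidate, in priority order, that the checker accepts.
     * An empty result corresponds to reporting that type inference failed.
     */
    static Optional<String> firstAccepted(List<String> candidatesByPriority,
                                          Predicate<String> typeChecks) {
        for (String candidate : candidatesByPriority) {
            if (typeChecks.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

Running the checker as the final arbiter is what turns the unsoundness of guessing from observed executions into mere incompleteness: a wrong guess is rejected instead of being reported as a type.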

Future Work. In the short term, we plan to implement a type checker for PRFJ; 
our current lack of a type checker limited our experiments to programs small 
enough for us to type-check manually. The run-time overhead of our instru- 
mentation is already moderate (see Section 6) but could perhaps be reduced 
by incorporating ideas from [vPG01]. In the longer term, we are investigating
type inference for race-free variants of C, such as [Gro03]. We expect run-time 
type inference to be even more beneficial in this context. Boyapati and Rinard’s 
defaults in PRFJ are very effective, in part because Java provides built-in locks, 
so this is a very good guess at the identity of the protecting lock in many cases. 
Devising equally effective defaults for variants of C seems difficult.

2 Related Work 

Run-time type inference is very similar in spirit to Daikon [Ern00]. During execution
of a program, Daikon evaluates a large syntactic class of predicates at
specified program points and determines, for each of those program points, the 
subset of those predicates that always held at that program point during the 
monitored executions. Among those predicates, those that satisfy some addi- 
tional criteria are reported as candidate invariants. Daikon cannot infer PRFJ 
types, because the invariants expressed by PRFJ types are not expressible in 
Daikon’s language for predicates. Daikon infers predicates that can be evalu- 
ated at a single program point. In contrast, a single PRFJ type annotation can 
express an invariant that applies to many program points. For example, if the 
declaration of a field f in a class C is annotated with the PRFJ type self, it
means (roughly): for all instances o of class C, for all objects o' ever stored in
field f of o, o' is protected by its own lock, i.e., the built-in lock associated (by
the Java language semantics) with o' is held whenever any field of o' is accessed.
Such accesses may occur throughout the program. 

Naik and Palsberg developed an expressive type system for ensuring that an 
interrupt-driven program will handle every interrupt within a specified deadline 




Type Inference for Parameterized Race-Free Java 



153 



[NP03]. Their type system is equivalent to model checking, in the sense that 
a program is typable exactly when their model checker, applied to a specified 
abstraction of the program, verifies that all deadlines are met. Due to this close 
connection, types for a program can be inferred from the output of the model 
checker. Our goal is quite different than theirs: we aim to show that inexpensive 
run-time techniques (in contrast to relatively expensive model checking) can 
provide an effective basis for type inference. 

Static analyses such as meta-compilation [HCXE02] and type qualifiers 
[FTA02] have been used to check or verify simple lock-related properties of con- 
current programs, e.g., that a lock is not acquired twice by the same thread 
without an intervening release. Such analyses cannot easily be used to check 
more difficult properties such as race-freedom. 



3 Overview of Parameterized Race Free Java (PRFJ) 

The PRFJ type system is based on the concept of object ownership. Each object
is associated with an owner which is specified as part of the type of the variables 
that refer to that object. Each object is owned by another object, or by special 
values thisThread, self, unique or readonly. Since an object can be owned by 
another object which in turn could be owned by another object, the ownership 
relation can be regarded as a forest of rooted trees, where the roots may have self 
loops. Ownership information expresses a synchronization discipline: to safely 
access an object o, a thread must hold the lock associated with the root r of the 
ownership tree containing o; r is called o’s root owner. 

An object with root owner thisThread is unshared. Such objects can be ac- 
cessed without synchronization. This is reflected in the type system by declaring 
that every thread implicitly holds the lock associated with thisThread. An ob- 
ject with owner self is simply owned by itself. If an object o has owner unique, 
there is a single (unique) reference to o. Only the thread currently holding that 
reference can access o, so there is no possibility of race conditions involving o, 
and no lock needs to be held when accessing o. An object with owner readonly 
cannot be updated and can be accessed without any locks. 
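
The notion of a root owner can be sketched as a walk up the ownership forest. The representation below is an assumption made for illustration: objects and special values are named by strings, an object with owner self is modelled as owning itself, and special values such as thisThread simply have no entry in the map. The names Ownership and rootOwner are hypothetical.

```java
import java.util.Map;

final class Ownership {
    /**
     * Follows owner links from an object up to the root of its ownership tree.
     * A root is either a special value (absent from the map) or an object that
     * owns itself. The map is assumed to contain no cycles other than self loops.
     */
    static String rootOwner(Map<String, String> ownerOf, String object) {
        String current = object;
        while (ownerOf.containsKey(current) && !ownerOf.get(current).equals(current)) {
            current = ownerOf.get(current);
        }
        return current;
    }
}
```

Under this model, a thread may safely access an object when it holds the lock of the value returned by rootOwner, or when that value is thisThread.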

Every class in PRFJ is parameterized with one or more parameters. Param- 
eterization allows the programmer to specify appropriate ownership information 
separately for each use of the class. The first parameter always specifies the 
owner of the this object. The remaining parameters, if any, may specify the 
owners of fields or parameters or return values of methods. The first parameter 
of a class can be a formal owner parameter or one of the special values discussed 
above; the remaining parameters must be formal owner parameters. When the 
class is used in the program, its formal owner parameters are instantiated with 
final expressions or the above special values. Final expressions are expressions 
whose value does not change; using them to represent owners ensures that an 
object’s owner does not change from one object to another. Syntactically, final 
expressions are built from final variables, including the implicit this variable, 
final fields, and static final fields. Ownership changes that do not lead to race 
public class MyThread<thisThread> extends Thread<thisThread> {

    public ArrayList<self, readonly> ls;

    public MyThread(ArrayList<self, readonly> ls) {
        this.ls = ls;
    }

    public void run() {
        synchronized (this.ls) { ls.add(new Integer<readonly>(10)); }
    }

    public static void main(String args[]) {
        ArrayList<self, readonly> ls = new ArrayList<self, readonly>();
        MyThread<unique> m1 = new MyThread<unique>(ls);
        MyThread<unique> m2 = new MyThread<unique>(ls);

        m1--.start();
        m2--.start();
    }
}



Fig. 1. A Sample PRFJ Program. 



conditions are allowed; for example, an object’s owner may change from unique 
to any other owner. 

Every method is annotated with a clause of the form “requires $e_1, \ldots, e_n$”,
where the $e_i$ are final expressions. Locks on the root owners of the objects listed
in the requires clause must be held at each call site.

The type checking rules ensure that in a well-typed program, an object that 
is not readonly can be accessed only by a thread that either holds the lock on 
the root owner of the object or has a unique reference to the object. This implies 
that the program is race-free. 

To illustrate the PRFJ system, consider the program in Figure 1. The defini- 
tion of class ArrayList is not shown, but it has two owner parameters: the first 
specifies the owner of the ArrayList itself, and the second specifies the owner of 
the objects stored in the ArrayList. The MyThread constructor returns a unique
reference to the newly allocated object. Thus, the main thread has unique references
to the two instances of MyThread until they are started. After an instance of
MyThread is started, it is accessed by only one thread (namely, itself) and hence
is unshared. Thus, the owner of each MyThread object changes from unique to
thisThread. The occurrences of -- in the main method indicate that the main
thread relinquishes its unique references to m1 and m2 when it starts them. These
occurrences of -- are required by the type checking rules, and we consider them
to be, in effect, type annotations. 
The two instances of MyThread share a single ArrayList object a. The lock
associated with a is held at every access to a, so a has owner self, and the first 
parameter of ArrayList is instantiated with self. Instances of the Integer 
class are immutable, so they have owner readonly. All objects stored in a have 
owner readonly, so the second owner parameter of ArrayList is instantiated 
with readonly. 

PRFJ defaults are unable to determine the unique, self, and readonly 
owners used in this program. Our type-inference algorithm described in Section 
4 infers all of the types correctly for this program. [AS03] shows in detail how 
our algorithm works for this program. 



4 Type Inference for PRFJ 

Our algorithm has three main steps. 

First, the static analysis in [AKC02] is used to infer unique and !e (“not
escaping”) annotations for fields, method parameters, return values, and local
variables (PRFJ’s !e annotation corresponds to the lent annotation in [AKC02]). 
We use static analysis for this because it is usually adequate and because run- 
time determination of which objects have unique references would be expensive. 

Second, run-time information is used to infer owners for fields, method pa- 
rameters and return values. Owners in class declarations are inferred next. For 
a class whose first owner is inferred to be a constant (i.e., anything other than
a formal owner parameter), all occurrences of that class in the program are 
instantiated with that constant as the first owner. 

Third, the intra-procedural type inference algorithm in [BR01, Section 7.1] is
applied, to infer the types of local variables whose types have not already been 
determined. 

Few classes need multiple owner parameters, and most of the classes that 
do are library classes, which can be annotated once and re-used, so we do not 
attempt to infer which classes C need multiple owner parameters or how those 
parameters should be used in the declarations of fields and methods of C. We 
assume this information is given. We do try to infer how to instantiate those 
owner parameters in all uses of C. 

4.1 Inferring Unique Owners 

The static uniqueness analysis in [AKC02] is a fairly straightforward flow- 
sensitive context-insensitive inter-procedural data-flow analysis whose running 
time is linear in the size of the program. For details, see [AKC02]. 

4.2 Inferring Owners for Fields, Method Parameters, and Return 
Values 



Let x denote a field, method parameter, or method return type with reference
type (i.e., not a base type). To infer the owner of x (i.e., the first owner in the
type of x), we monitor accesses to a set S(x) of objects associated with x. If x
is a field of some class C, S(x) contains objects stored in that field of instances
of C. For a method parameter x, S(x) contains arguments passed through that
parameter. For a method return type x, S(x) contains objects returned from
the method. Let FE(x) denote the set of final expressions that are syntactically
legal (i.e., in scope) at the program point where x is declared.

After an object o is added to S(x), every access (read or write) to o is
intercepted and some information is recorded. Specifically, at the end of run-time
monitoring, the following information is available for each object o in S(x):
lkSet(x, o), the set of locks that were held at every access to o after o was inserted
in S(x) [SBN+97]; rdOnly(x, o), a boolean that is true iff no field of o was written
(updated) after o was inserted in S(x); shar(x, o), a boolean that is true iff o is
“shared”, i.e., multiple threads accessed non-final fields of o after o was inserted
in S(x); val(x, o, e), the value of final expression e at an appropriate point for x
and o, for each $e \in \mathrm{FE}(x)$. If x is a field, the appropriate point is immediately
after the constructor invocation that initialized o. If x is a parameter of a method
m, the appropriate point is immediately before calls to m at which o is passed
through parameter x. If x is a return type of a method m, the appropriate point
is immediately after calls to m at which o is passed as the return value of m.
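
The per-object bookkeeping can be sketched as follows, in the style of the lockset algorithm: the lockset starts as the locks held at the first access and is intersected with the locks held at each later access. The class AccessRecord and its method names are hypothetical, and the sketch simplifies by treating every intercepted access alike rather than distinguishing final from non-final fields.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

/** Hypothetical record kept for one monitored object o after it enters S(x). */
final class AccessRecord {
    private Set<Object> lockSet = null;                 // null until the first access
    private boolean readOnly = true;                    // rdOnly(x, o)
    private final Set<Long> threads = new HashSet<>();  // ids of accessing threads

    /** Called on every intercepted access to o. */
    void onAccess(long threadId, boolean isWrite, Set<Object> locksHeld) {
        if (lockSet == null) {
            // Locks are compared by identity, not by equals().
            lockSet = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
            lockSet.addAll(locksHeld);
        } else {
            lockSet.retainAll(locksHeld);               // keep locks held at every access
        }
        if (isWrite) {
            readOnly = false;
        }
        threads.add(threadId);
    }

    /** lkSet(x, o); this sketch returns the empty set if o was never accessed. */
    Set<Object> lkSet() {
        return lockSet == null ? Collections.<Object>emptySet() : lockSet;
    }

    boolean rdOnly() { return readOnly; }

    /** shar(x, o): true iff more than one thread accessed o. */
    boolean shar() { return threads.size() > 1; }
}
```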

The owner of x is determined by the first applicable rule below. 

1. If the Java type of x is an immutable class (e.g., String or Integer), then
owner(x) = readonly.

2. If $(\forall o \in S(x) : \neg\,\mathrm{shar}(x, o))$, then owner(x) = thisThread.

3. If $(\forall o \in S(x) : \mathrm{rdOnly}(x, o))$, then owner(x) = readonly.

4. If $(\forall o \in S(x) : o \in \mathrm{lkSet}(x, o))$, then owner(x) = self.

5. Let E(x) be the set of final expressions e in FE(x) such that, for each object o
in S(x), val(x, o, e) is a lock that protects o; that is,
$E(x) = \{e \in \mathrm{FE}(x) \mid \forall o \in S(x) : \mathrm{val}(x, o, e) \in \mathrm{lkSet}(x, o)\}$.
If E(x) is non-empty, take owner(x) to be an arbitrary element of E(x).

6. Take owner(x) to be a formal owner parameter of the class containing x,
normally thisOwner.^1
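
The six rules amount to a first-match decision procedure over the logged information. The sketch below assumes a hypothetical summary class Obs holding lkSet, rdOnly, shar and val for one object, and returns the inferred owner as a string; for rule 5 it takes the first suitable final expression as the "arbitrary" element.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

final class OwnerRules {
    /** Hypothetical summary of what was logged for one object o in S(x). */
    static final class Obs {
        final Object self;              // the object o itself
        final Set<Object> lkSet;        // lkSet(x, o)
        final boolean rdOnly;           // rdOnly(x, o)
        final boolean shar;             // shar(x, o)
        final Map<String, Object> val;  // final expression e -> val(x, o, e)

        Obs(Object self, Set<Object> lkSet, boolean rdOnly, boolean shar,
            Map<String, Object> val) {
            this.self = self;
            this.lkSet = lkSet;
            this.rdOnly = rdOnly;
            this.shar = shar;
            this.val = val;
        }
    }

    /** Applies the first applicable rule; fe lists the final expressions in FE(x). */
    static String owner(boolean immutableType, List<Obs> s, List<String> fe) {
        if (immutableType) return "readonly";                                   // rule 1
        if (s.stream().noneMatch(o -> o.shar)) return "thisThread";             // rule 2
        if (s.stream().allMatch(o -> o.rdOnly)) return "readonly";              // rule 3
        if (s.stream().allMatch(o -> o.lkSet.contains(o.self))) return "self";  // rule 4
        for (String e : fe) {                                                   // rule 5
            if (s.stream().allMatch(o -> o.lkSet.contains(o.val.get(e)))) {
                return e;
            }
        }
        return "thisOwner";                                                     // rule 6
    }
}
```

Note that when S(x) is empty the universally quantified condition of rule 2 holds vacuously, so this sketch returns thisThread in that case.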

To reduce the run-time overhead, we restrict S(x) to contain only selected
objects associated with x. This typically does not affect the inferred types. We
currently use the following heuristics to restrict S(x). For a field x with type
C, S(x) contains at most one object created at each allocation site for C. For a
method parameter or return type x, S(x) contains at most one object per call
site of that method. Also, we restrict FE(x) to contain only the values of final
expressions of the form this or this.f, where f is a final field.

4.3 Inferring Values of Non-first Owner Parameters 

If the Java type of x is a class C with multiple owner parameters, for each
formal owner parameter P of C other than the first, we need to infer $\mathrm{owner}_P(x)$,

^1 This rule is not needed for the examples in Section 6.
the value with which P should be instantiated for x. Let $S_P(x)$ denote a set of
objects o' associated with P for x and such that P denotes the first owner of o'. In
particular, for each o in S(x): (1) for each field f of C declared with P as the first
owner of f (i.e., class C<...,P,...> { ... D<P> f; ... }), objects stored in
o.f are added to $S_P(x)$; (2) for each parameter p of a method m of C such that the
first owner of p is P (i.e., class C<...,P,...> { ... m(..., D<P> p, ...) ...
}), add to $S_P(x)$ arguments passed through parameter p when o.m is invoked;
(3) for each method m of C whose return type has first owner P, add to $S_P(x)$
objects returned from invocations of o.m. We instrument the program to monitor
accesses to objects in $S_P(x)$ and infer an owner based on that, just as in Section
4.2. As an optimization, we may restrict $S_P(x)$ to contain a subset of the objects
described above.



4.4 Inferring Owners in Class Declarations 

Let owner(C) denote the first owner in the declaration of class C (i.e., it denotes
o in class C<o, ...>). Let S(C) contain instances of C. We monitor accesses to
elements of S(C) as in Section 4.2 and then use the following rules to determine
owner(C).

1. If C is a subclass of a class C' with owner(C') = self, then owner(C) = self.

2. If $S(C) = \emptyset$ (i.e., there are no instances of C), then owner(C) = thisOwner.^2

3. If $(\forall o \in S(C) : \neg\,\mathrm{shar}(x, o))$, then owner(C) = thisThread.

4. If $(\forall o \in S(C) : o \in \mathrm{lkSet}(x, o))$, then owner(C) = self.

5. Take owner(C) = thisOwner.

For efficiency, we restrict S(C) to contain only a few instances of C. Currently,
we arbitrarily pick two fields or method parameters or return values of type C
and take S(C) to contain the objects stored in or passed through them.



4.5 Inferring requires Clauses 

We infer requires clauses basically as in [BR01], except we use run-time monitoring
instead of user input to determine which classes C have owner thisThread
(i.e., each instance of C is accessed by a single thread).

Each method declared in each class with owner thisThread is given an empty
requires clause. For each method m in each other class, the requires clause
contains all method parameters p (including the implicit this parameter) such
that m contains a field access p.f (for some field f) outside the scope of a
synchronized (p) statement; as an exception, the run() method of classes that
implement Runnable is given an empty requires clause, because a new thread
holds no locks.
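
A minimal sketch of this rule follows. It assumes that each method body has already been summarized as a list of field accesses whose receivers are method parameters (or this), each tagged with the parameters locked by the enclosing synchronized statements; that summary format, and the names RequiresInference and FieldAccess, are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class RequiresInference {
    /** Hypothetical summary of one field access p.f inside a method body. */
    static final class FieldAccess {
        final String receiver;       // the parameter p, or "this"
        final Set<String> syncedOn;  // parameters locked by enclosing synchronized statements

        FieldAccess(String receiver, Set<String> syncedOn) {
            this.receiver = receiver;
            this.syncedOn = syncedOn;
        }
    }

    /** Returns the parameters to list in the method's requires clause. */
    static Set<String> requiresClause(boolean classOwnerIsThisThread,
                                      boolean isRunOfRunnable,
                                      List<FieldAccess> accesses) {
        Set<String> req = new LinkedHashSet<>();
        if (classOwnerIsThisThread || isRunOfRunnable) {
            return req;              // empty requires clause
        }
        for (FieldAccess a : accesses) {
            if (!a.syncedOn.contains(a.receiver)) {
                req.add(a.receiver); // access outside synchronized (p)
            }
        }
        return req;
    }
}
```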

^2 For example, in many programs, the class containing the main method is never
instantiated.
4.6 Static Intra-procedural Type Inference

The last step is to infer the types of local variables whose types have not al- 
ready been determined, using the intra-procedural type inference algorithm in 
[BR01, Section 7.1]. Each incomplete type (i.e., each type for which the values
of some owner parameters are undetermined) is filled out with an appropriate 
number of fresh distinct owner parameters. Equality constraints between owners 
are constructed in a straightforward way from each assignment statement and 
method invocation. The constraints are solved in almost linear time using the 
standard union-find algorithm. For each of the resulting equivalence classes E, if 
E contains one known owner o (i.e., an owner other than the fresh parameters),
then replace the fresh owner parameters in E with o. If E contains multiple 
known owners, then report failure. If E contains only fresh owner parameters, 
then replace them with thisThread. This heuristic is adequate for the examples 
we have seen, but if necessary, we could instrument the program to obtain run- 
time information about objects stored in those local variables, and then infer 
their owners as in Section 4.2. 
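
The constraint-solving step can be sketched with a small union-find structure over owner names. This is an illustration, not the implementation of [BR01]: the class OwnerUnifier and its methods are hypothetical, owners and fresh parameters are plain strings, and resolve scans the known owners for brevity rather than for efficiency.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class OwnerUnifier {
    private final Map<String, String> parent = new HashMap<>();
    private final Set<String> known = new HashSet<>();  // owners that are not fresh parameters

    /** Registers a known owner such as "self" or "readonly". */
    void addKnown(String owner) {
        known.add(owner);
        find(owner);
    }

    /** Returns the representative of v's equivalence class, compressing the path. */
    String find(String v) {
        String p = parent.computeIfAbsent(v, k -> k);
        if (p.equals(v)) {
            return v;
        }
        String root = find(p);
        parent.put(v, root);
        return root;
    }

    /** Adds the equality constraint a = b. */
    void equate(String a, String b) {
        parent.put(find(a), find(b));
    }

    /**
     * Owner chosen for v: the single known owner in its class, or thisThread if
     * the class holds only fresh parameters. Several known owners mean failure.
     */
    String resolve(String v) {
        String root = find(v);
        String result = null;
        for (String k : known) {
            if (find(k).equals(root)) {
                if (result != null) {
                    throw new IllegalStateException("conflicting owners: " + result + ", " + k);
                }
                result = k;
            }
        }
        return result != null ? result : "thisThread";
    }
}
```

Each assignment or method invocation would contribute calls to equate; resolve is then queried once per local variable whose owner is still undetermined.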

5 Implementation 

This section describes the source-to-source transformation that instruments a 
program so that it will record the information needed by the type inference 
algorithm in Section 4. The transformation is parameterized by the set of classes
for which types should be inferred. 

All instances of Thread are replaced with ThreadwithLockSet, a new class 
that extends Thread and declares a field locksHeld. Synchronized statements 
and synchronized methods are instrumented to update locksHeld appropriately; 
a try/finally statement is used in the instrumentation to ensure that excep- 
tions are handled correctly. For each field, method parameter and return type 
x being monitored, a distinct IdentityHashMap is added to the source code.
The hashmap for x is a map from objects o in S(x) to the information recorded
for o, as described in Section 4, except the lockset. We store all locksets in a
single IdentityHashMap. Thus, even if an object o appears in S(x) for multiple
x, we maintain a single lockset for o. Object allocation sites, method invocation
sites, and field accesses are instrumented to update the hashmaps appropriately. 
Which sites and expressions are instrumented depends on the set of classes spec- 
ified by the user. 
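
As a rough picture of what the instrumentation of a synchronized statement might produce, consider the sketch below. The names ThreadwithLockSet and locksHeld come from the description above; the helper class Instrumented, its method, and the use of an identity-based set are assumptions made for illustration, and a real transformation would rewrite the statement in place rather than call a helper.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

/** Thread subclass that carries the set of locks the thread currently holds. */
class ThreadwithLockSet extends Thread {
    final Set<Object> locksHeld =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    ThreadwithLockSet(Runnable body) {
        super(body);
    }
}

final class Instrumented {
    /** Behaves like synchronized (lock) { body } while keeping locksHeld up to date. */
    static void syncWithTracking(Object lock, Runnable body) {
        Set<Object> held = ((ThreadwithLockSet) Thread.currentThread()).locksHeld;
        synchronized (lock) {
            boolean added = held.add(lock);  // false on a re-entrant acquisition
            try {
                body.run();
            } finally {
                if (added) {
                    held.remove(lock);       // restored even if body throws
                }
            }
        }
    }
}
```

The try/finally mirrors the point made in the text: the lock must leave locksHeld on every exit path, including exceptional ones, or later accesses would be logged with a lock that is no longer held.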

6 Experience 

To evaluate the inference engine, we ran our system on the five multi-threaded 
server programs used in [BR01]. C. Boyapati kindly sent us the annotated PRFJ
code for these servers. The programs are small, ranging from about 100 to 600
lines of code. We compare the number of annotations in that code (this is also
the number of annotations needed for the mechanisms in [BR01] to successfully
infer the remaining types) with the number of annotations needed with our
type-inference algorithm. Recall that our type-inference algorithm is also incomplete
and hence might be unable to infer some types. In these experiments, we infer
types only for the application (i.e., server) code; we assume PRFJ types are
given for Java API classes used by the servers.

In summary, our type-inference mechanism successfully inferred complete and 
correct typings for all five server programs, with no user-supplied annotations. 
Also, slowdown due to the instrumentation was typically about 20% or less.

Four of the server programs did not come with clients, so we wrote very 
simple clients for them. One server program (PhoneServer) came with a simple 
client. We modified the servers slightly so they terminate after processing one 
or two requests (in our current implementation, termination triggers writing of 
collected information to the log). A single execution of each server with its simple
client provided enough information for successful run-time type inference. This 
supports our conjecture (in Section 1) that data from a small number of simple 
executions is sufficient. 

The original PRFJ code for GameServer requires 7 annotations. Our algo- 
rithm infers types for the program with no annotations. We wrote a simple client 
with two threads. This program is sensitive to the scheduling of threads. Different 
interleavings of the threads caused different branches to be taken, leaving other 
branches in the code unexercised in that execution. We applied our run-time 
type inference algorithm to each of the possible executions, and it successfully
inferred the types in every case. 

The original ChatServer code contains 13 annotations. We wrote a simple 
client with two threads. One class (read_from_connection) in the server
program has only final fields, and the class declaration can correctly be typed with
owner self or thisThread. The original code contains a manual annotation of
self, while our algorithm infers thisThread. The ChatServer, like the other
servers, uses Boyapati’s modified versions of Java API classes (e.g., Vector),
from which synchronization has been removed. The benefit is that synchroniza- 
tion can be omitted in contexts where it is not needed; the downside is that, 
when synchronization is necessary, it must be included explicitly in the appli- 
cation code. We also considered a variant of this server that uses unmodified 
Java library classes. Our algorithm infers a complete and correct typing for both 
variants, with no assistance from the user. 

The original QuoteServer code contains 15 annotations, PhoneServer code 
contains 12 annotations across 4 classes and HTTPServer contains 23 annota- 
tions across 7 classes. Our algorithm infers types for each of these programs 
with no annotations. Most of the annotations in QuoteServer were unique an- 
notations, and static analysis of uniqueness is able to infer the annotations. For 
the HTTPServer program the inferred types are not exactly the same as those 
in the original code, as was the case with ChatServer. 

Encouraged by these initial results, we plan to apply our system to much 
larger examples as soon as we have implemented a type-checker for PRFJ. 

Abstract. We investigate the certification of temporal properties of untrusted
code. This kind of certification has many potential applications, including high 
confidence extension of operating system kernels. The size of a traditional, proof- 
based certificate tends to expand drastically because of the state explosion prob- 
lem. Abstraction-carrying Code (ACC) obtains smaller certificates at the expense 
of an increased verification time. In addition, a single ACC certificate may be 
used to certify multiple properties. ACC uses an abstract interpretation of the mo- 
bile program as a certificate. A client receiving the code and the certificate will 
first validate the abstraction and then run a model checker to verify the temporal 
property. 

We have developed ACCEPT/C, a certifier of reachability properties for an in- 
termediate language program compiled from C source code, demonstrating the 
practicality of ACC. Novel aspects of our implementation include: 1) the use of 
a Boolean program as a certificate; 2) the preservation of Boolean program ab- 
straction during compilation; 3) the encoding of the Boolean program as program 
assertions in the intermediate program; and 4) the semantics-based validation of 
the Boolean program via a verification condition generator (VCGen). Our expe- 
rience of applying ACCEPT/C to real programs, including Linux and NT drivers, 
shows a significant reduction in certificate size compared to other techniques of 
similar expressive power; the time spent on model checking is reasonable. 



1 Introduction 

Proof-carrying code (PCC) techniques address the problem of how a client trusts a
mobile (typically low-level) program by verifying a mathematical proof for certain safety
properties [18]. Specifically, a server applies a safety policy to the program to generate 
verification conditions (VCs) and certifies the safety property with a checkable proof 
of the VCs. When a client generates the same set of VCs and checks the proof, PCC 
guarantees the client the safety of that program. Early variants of PCC, including TAL 
[17,16], ECC [13], Certified Binary [25], EPCC [1], Touchstone [18] and others, fo- 
cus primarily on the verification of type safety and memory safety, properties easily 
expressed in first-order logic. 

This paper investigates the certification of general temporal properties. Temporal 
logic [14] can express powerful safety policies, such as the correct use of APIs [2,8], 
or liveness requirements. Such properties enable an operating system kernel to start a 
service developed by an untrusted party with high confidence that critical invariants,
such as correct locking discipline, will not be compromised. 

Other researchers are also extending the PCC framework to address temporal proper- 
ties. In most of the proposed systems, a certificate is, as in PCC, a proof of the temporal 
property. Whether this proof is worked out using a theorem prover, as in the case of 
Temporal PCC (TPCC) [4], or translated from the computation of a model checker [26, 
11,12], the size of the proof tends to be large due to state explosion. In some cases a 
proof is several times larger than the program size even for trivial properties [4].

We are exploring Abstraction-carrying code (ACC), where an abstract interpretation
of the program is sent as a certificate. Our specific strategy is to adopt refinement based
[5,20] predicate abstraction techniques [2,21,7,6,3] as the certification method. The
abstraction model that is successfully model checked certifies the property of concern. 
In this paper, we use Boolean programs [2] as the abstract models. 

Boolean programs are often computed for source programs. To certify mobile pro- 
grams communicated in an intermediate language, we compile the source code into the 
intermediate language while preserving the correspondence with the abstract model. The 
intermediate program, the predicates and the abstract model are sent to the client. The 
client generates verification conditions to validate the abstract model and then model 
checks the abstract model to verify the temporal property. 

To investigate the feasibility of ACC, we have constructed the ACC Evaluation Pro- 
totype Toolkit for C (ACCEPT/C). This toolkit, while not yet a full implementation of 
ACC, contains a certifying compiler. ACCEPT/C enables measurement of the size of 
certificates and the computational resources for certificate generation, validation and
re-verification. ACCEPT/C is built on top of BLAST [8] and CIL [15]. The implemen- 
tation exploits existing capabilities of BLAST and handles additional technical issues. 
In particular: 

- ACCEPT/C computes a Boolean program from BLAST's intermediate result, where
a search tree is generated without an explicit concept of a static abstract model that 
resembles a Boolean program. 

- ACCEPT/C compiles the Control Flow Automaton (CFA) representation of the
source code to an intermediate program for which the Boolean program is still 
a valid abstraction. 

- ACCEPT/C encodes the Boolean program as program assertions in the compiled 
program. 

- On the client side, the verification condition generator (VCGen) produces a set of 
conditions that can be discharged by an automatic theorem prover. 

Figure 1 illustrates the relationship between major data artifacts used by ACCEPT/C. 
To evaluate ACC, we are applying ACCEPT/C to Linux and NT drivers and other 
public domain programs. For API-compliance properties, most test cases yield a small 
certificate, acceptable model validation time and a surprisingly small model checking 
overhead. Initial investigation of a public domain implementation of an SSH server 
supports our conjecture that a single, yet small, ACC certificate can establish multiple 
properties. 

Fig. 1. System overview of ACCEPT/C

An alternative approach is the Model-carrying Code (MCC) framework [23,24]. In
MCC the model is generated using statistical techniques based on server side
observations. The model is validated at run time by the client, though the verification
of the property is done statically. It does not require the client and the server to share
details of the security property being verified. However, without such knowledge, it is
sometimes hard to compute a model that is good enough to verify complicated properties.
In addition, the dynamic validation of the model introduces further workload for a client.

The paper is organized as follows. In Section 2, we introduce the technical
background of ACC. Section 3 introduces how we compute Boolean programs from BLAST's
lazy abstraction results. Section 4 presents the abstraction-preserving compilation
framework. Section 5 describes the encoding of Boolean programs as program assertions.
Section 6 introduces the verification condition generator (VCGen). Section 7 reports our 
case studies with ACCEPT/C. We assume the readers have some knowledge of predicate 
abstraction. One example drawn from SLAM is used throughout the paper. 



2 Background 

This section reviews control flow automata (CFA), a mathematical abstraction used in 
BLAST [8], and adapts the formalism to models of intermediate representations. These 
results establish that ACC is expressive over next operator-free Linear Temporal Logic 
(LTL) formulas built over predicates on the system state. Preliminary accounts of these 
results are in our earlier workshop papers on ACC [29,28]. 



2.1 Control Flow Automata 

Henzinger and colleagues in the BLAST (Berkeley Lazy Abstraction Software Veri- 
fication Tool) group use Control Flow Automata (CFA) to encode control flow graph 
information in a manner suitable for model checking. In particular, the CFA framework 
provides a mechanism in which computations can be seen as paths in trees accepted by 
CFAs. The BLAST group discuss CFAs in several publications, where they are motivated 
by examples [9,10] and presented formally [8]. 

The CFA formalism characterizes a computation by a control state (or location) and 
a data state. For every function in the C program, a CFA is built. The set of control states 
is explicitly represented by a finite control flow graph. Edges in the graph are labeled
by symbolic terms that characterize either changes to the data state (basic blocks of 
C statements) or predicates on the data state that enable the change of control state to 
happen (e.g. predicates computed from if statement conditionals). Later we will extend
a CFA to allow the labels to be basic blocks of an intermediate language. 

A set of CFAs representing a program induces a transition system, which accepts an 
infinite tree of (location, data state) pairs. A stack state is implicit in the path from the 
root to any node in the tree. Each edge in the tree must correspond to a labeled transition 
in the CFA and the data state must satisfy the label on the CFA transition. The data state 
is specified abstractly so that the data states on a path in the tree accepted by the CFA are 
an abstraction of the concrete states reached during an execution of the program. The 
abstraction is selected so that the concrete states that result in the same control behavior 
are identified (states identified in this way are called a region). The tree represents the 
reachable regions of the computation. 

Model checking techniques allow this transition system to be tested for compliance 
with properties expressed in LTL, provided the LTL formula in question is built over 
atomic formulas that can be tested algorithmically on the CFA state. 

The relationship between the abstract data states of the execution tree and the regions
of concrete states of the C program can be formulated as a Galois connection, $(\alpha, \gamma)$.
The abstraction function $\alpha$ maps a concrete region to an abstract one. The pair induces an
abstract transition relation $\rightarrow_\alpha$ between abstract regions. An abstract transition system
can be computed from the abstract state space and $\rightarrow_\alpha$. The abstract transition system can
be model checked to verify the property on the concrete system. Because of the Galois
connection requirement, the relation $\rightarrow_\alpha$ is an over-approximation of the transition
relation of the concrete system. As a result, if we model check the abstract transition
system, only global properties, such as LTL formulas, can be verified on the original
system.

ACCEPT/C adopts Boolean abstraction, where the abstract transition $\rightarrow_\alpha$ is defined
by a Boolean program [2].
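
To make the idea of an induced abstract transition relation concrete, the following Python sketch abstracts states by a fixed list of predicates and collects the abstract edges produced by one concrete statement. It is an illustration only: the predicates, the variable names and the function names are hypothetical, and the relation is computed by enumerating a small finite sample of states, whereas a real tool would reason symbolically with a theorem prover.

```python
from itertools import product

# Predicates over a concrete state (a dict of integer variables).
PREDICATES = [
    lambda s: s["lock"] == 1,
    lambda s: s["n"] == s["n_old"],
]

def alpha(state):
    """Abstraction: map a concrete state to its vector of predicate truth values."""
    return tuple(p(state) for p in PREDICATES)

def step(state):
    """Concrete transition for the statement n++."""
    return {**state, "n": state["n"] + 1}

def abstract_transitions(domain):
    """Abstract edges induced by `step` on a finite sample of concrete states."""
    edges = set()
    for lock, n, n_old in product((0, 1), domain, domain):
        s = {"lock": lock, "n": n, "n_old": n_old}
        edges.add((alpha(s), alpha(step(s))))
    return edges
```

Starting from an abstract state in which the second predicate is false, both outcomes are possible after the increment, which is the kind of imprecision an over-approximation admits.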



2.2 Lazy Abstraction 

Traditional predicate abstraction builds an abstract model of a program based on a fixed 
set of predicates, called the abstraction basis. We will call such systems uniform because 
they use the same abstraction basis for every statement in the source program. Traditional 
approaches are also static because they construct the model completely prior to model 
checking. 

Lazy abstraction is a dynamic, non-uniform abstraction technique. It builds a model 
on-the-fly during model checking. The abstraction function is different in different parts 
of the model checking search tree. Predicates are added only in those portions of the 
search tree where they are needed. A complete treatment of lazy abstraction can be found 
in the BLAST literature [8, Section 3].



2.3 Abstracting Intermediate Programs 

BLAST models source programs. To support mobile code verification we must develop 
models of the intermediate code supplied from server to client. There are essentially two 
ways to approach this: directly build a model of the intermediate program or calculate 
a refinement of a model of the source program in parallel with compilation. We have
chosen the second approach because significantly more predicates are required at the 
intermediate level than at the source level to capture an abstract transition relation that 
enables successful model checking. In addition, to the extent that any tools exist to allow 
the programmer to assist in predicate discovery and refinement, these tools focus on 
source level abstractions. 

To retain the validity of the abstract model as we compile, we restrict the compilation 
process to preserve as much of the original control flow behavior of the program as 
possible. In particular, the intermediate program shares (more precisely, refines) the same 
control flow graph. This allows us to find a simple correspondence between the compiled 
program and the Boolean program. Namely, an intermediate block compiled from a 
source statement is associated with the Boolean program statements corresponding to 
the source statement. The compilation of an if statement is associated with the assume 
statements corresponding to the if statement. 

In addition to preservation of control flow it is also critical to preserve the semantics 
of all program values that are necessary to validate the correspondence between the 
compiled code and the Boolean program. This is accomplished by identifying a set of 
observable values in the source program and ensuring that all residual representations of
these values in the intermediate program are explicitly stored in memory. And when a 
variable is modified, the change should be written through to its memory copy. 

Finally, compilation must respect the granularity of the semantics of the temporal 
logic modalities on the source program. Basic blocks in the intermediate program can 
be no finer than the unit of observation in the semantics of the temporal modalities. A 
sufficient condition to achieve this is to restrict basic blocks in the intermediate program 
to modify at most one variable. 



3 Generation of Boolean Programs 

This section describes the calculation of the Boolean program that will ultimately be 
the certificate of the mobile code. The Boolean program will be compact and in many 
ways directly reflect the structure of the source program. In particular, every program 
point in the Boolean program will correspond to a single program location in the source 
program (and the CFA). To construct a Boolean program that can be successfully model 
checked, we must have some sufficiently refined sets of predicates. In ACCEPT/C, we
obtain these predicate sets from BLAST's lazy abstraction result.

With BLAST, the abstract model generated in lazy abstraction is dynamic and non- 
uniform. A Boolean program on the other hand is static. We will use a static, non-uniform 
Boolean program to approximate the dynamic, non-uniform model generated by BLAST 
to benefit from the strength of both approaches. 

BLAST associates a predicate set with each tree edge. To build the Boolean program, 
predicate sets on all the edges corresponding to the same transition in the CFA must be 
collected. For an edge in the source CFA, the predicates used to construct a Boolean 
program must be logically equivalent to the union of the predicate sets of the tree edges 
representing the corresponding transition of the CFA. We then apply the method
described in SLAM [2] to compute the Boolean program. For example, for an assignment
statement in C, we may compute an approximation of the weakest precondition of a
predicate as the condition for the corresponding Boolean variable to be true.
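
The collection step, in which predicate sets from all tree edges that map to the same CFA transition are merged, might be sketched as follows. The data layout (pairs of a CFA edge and a predicate set) and the function name are assumptions made for illustration; BLAST's internal representation is not shown in this paper.

```python
from collections import defaultdict

def predicates_per_cfa_edge(tree_edges):
    """Union the predicate sets of all search-tree edges that map to the same CFA edge.

    tree_edges: iterable of (cfa_edge, predicates) pairs, where cfa_edge is any
    hashable identifier of a CFA transition.
    """
    merged = defaultdict(set)
    for cfa_edge, predicates in tree_edges:
        merged[cfa_edge] |= set(predicates)
    return dict(merged)
```

The result is one static predicate set per CFA edge, which is what makes the resulting Boolean program static yet still non-uniform.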

The proposition below characterizes that the static Boolean program generated is 
still as appropriate a model of the original program as the dynamic model obtained by 
lazy abstraction, provided the lazy abstraction used Boolean abstraction locally. That is, 
we show that if the lazy abstraction-based model checking does not reach an error state, 
neither will the model checking of the Boolean program constructed above. 

A state in lazy abstraction or in a Boolean program can be viewed as a bit vector. 
Assume the order of the bits is fixed, that is, the same bit in different states corresponds
to the same predicate. We say state $s$ refines state $t$ if $s$ agrees with $t$ on every bit of $t$.

Proposition 1 If state $s_1$ refines state $s_2$, $s_1 \rightarrow_1 s_1'$ and $s_2 \rightarrow_2 s_2'$, where $\rightarrow_1$ is an
abstract transition relation induced by a subset of the predicates that induces abstract
transition relation $\rightarrow_2$, then $s_1'$ refines $s_2'$.

Therefore, if none of the leaves on the search tree contain an error state, neither will
any reachable state generated by model checking the Boolean program. 
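
The refinement relation on states can be written down directly. In this sketch a state is assumed to be a dictionary from predicate names to truth values, so that a state over fewer predicates simply has fewer keys; the representation and the function name are illustrative choices, not the paper's.

```python
def refines(s, t):
    """True if state s agrees with state t on every bit (predicate) of t."""
    return all(pred in s and s[pred] == value for pred, value in t.items())
```

For example, a state that tracks both a lock predicate and an equality predicate refines a state that tracks only the lock predicate, provided the two agree on the lock bit.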

These results show that ACC does not weaken any significant properties of the models 
we construct. In Figure 2, we list the C code used in the SLAM papers. The topology of the 
corresponding control flow automaton is illustrated on the right side of Figure 2 (labels 
are compiled code). According to the SLAM paper, model checking of the correct use 
of locks is successful only after we can observe the predicate nPacket = nPacketOld. 
BLAST will find this predicate and use it to abstract the program only when necessary. 
In this case it will note that this predicate is only required within the loop. Therefore, 
we abstract the statements in the loop with three predicates and abstract the rest of the 
program with two predicates. The corresponding Boolean program is shown to the right. 



    lockStatus = 0;
    Loop: {
        FSMLock();
        nPacketOld = nPacket;
        // some statements
        if (request != 0 && request->status != 0) {
            // some statements
            FSMUnlock();
            // some statements
            nPacket++;
        }
    }
    if (nPacket != nPacketOld) goto Loop;
    FSMUnlock();


Fig. 2. Correct Use of Locks: NT Driver 

4 Abstraction-Preserving Compilation

The abstraction-preserving compilation process starts with the CFA representation of 
the source program and the Boolean program computed in the previous section. The 
compilation process must produce an intermediate representation of the source program 
annotated in such a manner that a Boolean program corresponding to the intermediate 
representation can be extracted by the client and its fidelity verified. This is accomplished 
by annotating basic blocks in the intermediate code with simultaneous assignments to 
the Boolean variables and annotating certain program points, usually the beginning of 
an instruction sequence for a then or else clause, with predicates from the Boolean 
program assume statements. 

We present the straightforward approach that compiles every basic block in isola- 
tion, and maintains the Boolean program essentially unmodified. An extension allowing 
optimization is explored in an earlier paper [29]. 

Each source function is characterized by a CFA and a corresponding function in the 
Boolean program. Compilation is driven by traversing the control graph in the CFA. 
CFAs have two kinds of edges, those labeled by a basic block and those labeled by 
an assume predicate. For edges labeled by predicates the compilation process emits 
an appropriate test instruction and branch construct. Such an edge is labeled by the 
predicates in the corresponding assume statement in the Boolean program. For a basic 
block edge, the basic block is compiled in isolation and annotated with the residual 
simultaneous assignment to Boolean variables found in the Boolean program. The basic 
blocks are assembled according to the control flow graph. When multiple paths merge 
it is sometimes necessary to introduce labels for intermediate proxy nodes that can be 
labeled with appropriate predicates. 

The compilation of our example to an intermediate language is illustrated in Figure 2. 
The tests of conditions ((1) and (2)) in the intermediate program are not associated with an
edge. Instead, they are associated with a node that has multiple out-edges. The direction 
of edges in the figure is from top to bottom unless explicitly indicated. 

One of the important issues is the preservation of the property of concern. This 
meta-property derives from the coherence of the semantics of the temporal logic, source 
language and intermediate language. The result is given in detail in an earlier paper [28]. 



5 Annotation Language 

The annotation language is a quantifier-free predicate logic extended with equality, 
uninterpreted function symbols, simple arithmetic and pointer dereference. The syntax is 
listed in Figure 3. Symbol c is for constants; v is for variables. Variables in the annotation 
language represent values in registers and memory. Symbol & takes the address of its 
operand. A Boolean program assignment assigns to a Boolean variable b a schoose 
expression, which is evaluated to true if the first argument is true, or false if the second 
argument is true, or * (non-deterministic choice) when neither argument is true. 
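
The three-way behaviour of schoose can be captured in a few lines. The sketch below models the non-deterministic outcome with a random draw, which is an assumption made for illustration; a model checker would instead explore both outcomes.

```python
import random

def schoose(pos, neg, rng=random):
    """Evaluate schoose(pos, neg).

    Returns True if pos holds, otherwise False if neg holds, otherwise a
    non-deterministic Boolean (modelled here by a random choice).
    """
    if pos:
        return True
    if neg:
        return False
    return rng.choice([True, False])
```

Because the first argument is tested first, the result is true whenever it holds, matching the order of cases in the description above.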

There are two forms of annotations. A set of Boolean assignments annotates an
intermediate program block compiled from a source statement. The second form of annotation
associates a program location with a predicate, representing a program assertion. For
example, at the beginning of a block compiled from a C then clause of an if statement
we may specify such an assertion. This assertion corresponds to an assume statement in
a Boolean program. As we will see later, this form of annotation can be used to specify
other forms of assertions.

    expressions:         $E ::= c \mid v \mid f(E) \mid E_1 + E_2 \mid \&E \mid \cdots$
    predicate:           $P ::= \mathit{true} \mid \neg P \mid P_1 \vee P_2 \mid \cdots \mid P_1 = P_2$
    Boolean assignment:  $A ::= b = \mathtt{schoose}(P_1, P_2)$
    Boolean assumption:  $\mathtt{assume}(P)$

Fig. 3. Syntax of the Annotation Language

The reason for having pointer dereference is to handle aliases. With the presence of 
pointers, the computation of a (sound) Boolean program from a C program is expensive 
because every possible alias condition needs to be addressed; the resulting Boolean 
program tends to be unnecessarily large. Boolean abstraction on C often adopts pointer 
analysis to remove impossible cases. For example, if we have an assignment x := 0
and predicate *p = 0, the assignment may make the predicate true. If alias analysis can
guarantee that p and x are not related, then in computing the Boolean program, we
do not have to handle the predicate *p = 0. For the alias information to be trusted by a
client, we need to put an assertion $\neg(\&x = p)$ at the beginning of the block,
indicating that at that program point, we know p is not pointing to x. We use the
second form of the annotation mentioned above. Verification of such assertions is the
same as verifying an assertion introduced by an assume statement.
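
The effect of aliasing on the predicate *p = 0 can be seen in a toy memory model. In the sketch below memory is assumed to be a dictionary from addresses to values and the addresses 100 and 200 are arbitrary; the function names are hypothetical.

```python
def deref_is_zero(memory, p):
    """Evaluate the predicate *p = 0 in a memory mapping addresses to values."""
    return memory[p] == 0

def assign_x_zero(memory, addr_x):
    """Model the assignment x := 0, where addr_x plays the role of &x."""
    updated = dict(memory)
    updated[addr_x] = 0
    return updated

memory = {100: 5, 200: 7}      # x lives at address 100
after = assign_x_zero(memory, addr_x=100)

# If p aliases x (p == &x), the assignment makes *p = 0 true.
aliased_before, aliased_after = deref_is_zero(memory, 100), deref_is_zero(after, 100)

# If the assertion not(&x = p) holds, the predicate is unaffected.
unrelated_before, unrelated_after = deref_is_zero(memory, 200), deref_is_zero(after, 200)
```

Only the aliased case changes the truth value of the predicate, which is why a validated non-alias assertion lets the Boolean program omit that case.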



6 Generating Verification Conditions 

We send the compiled program, the predicate sets and the annotations to a client. It is 
the client’s responsibility to validate the annotations. The client will generate VCs and 
discharge them by calling a theorem prover. This is different from traditional PCC. A PCC
system usually generates these VCs on the server side; a server will invoke a certifying
theorem prover [19] to generate proofs for these VCs. The proofs will be attached to
the mobile program as part of the certificate. A client will verify the proof to discharge
the VCs. In ACCEPT/C, we choose not to generate and attach the proof for the VCs to
minimize the certificate size. As shown in Section 7, the overhead of theorem prover
calls in validating the Boolean program is typically small in practice. 

The VCGen applies the semantics of the intermediate language to work out the VCs.
Specifically, the intermediate language program defines a post operator $post$. For a
block $B$ annotated by assignments of the form $b_v := \mathtt{schoose}(e_1, e_2)$ in the Boolean
program, we need to validate that $e_1$ and $e_2$ are preconditions of the predicates that $b_v$ and
$\neg b_v$ represent (which we call $e_v$ and $\neg e_v$). To do so, we compute the post conditions of
$e_1$ and $e_2$ and check whether they imply $e_v$ and $\neg e_v$, respectively. Note that we do not have
to verify that $e_1$ or $e_2$ is a best approximation of the weakest precondition. Verifying
that they are preconditions will be enough. Formally, we generate a set of verification
conditions of the form:

$(post(c \wedge e_1, B) \Rightarrow e_v) \wedge (post(c \wedge e_2, B) \Rightarrow \neg e_v)$    (1)

where $c$ is true if $B$ has no location annotation, or $P$ if the beginning of $B$ is annotated
by an assertion ($P$).

Consider a block $B$ which is a test and jump. This block must be compiled from
an if statement in C. There must be an assertion specified at the target of the jump,
which we call $P_i$. If $B_j$ is the instruction sequence that computes the test condition,
we generate the verification condition $post(c, B_j) \wedge P_c \Rightarrow P_i$, where $c$ is as defined
above and $P_c$ is the test condition. This VC verifies the encoded assume statement or
any other assertions annotated at the beginning of the target. If there is an else branch,
we also generate a VC to verify the assertion specified at the next block, if any.

Below, we show how to combine the principles above with the semantics of the 
intermediate language to generate verification conditions. We first present the semantics 
and then describe how to compute the post condition. In the two subsections below, we 
take a small portion of the intermediate language similar to the one used in ACCEPT/C 
to demonstrate our techniques. 

6.1 Semantics of Intermediate Language 

We present a simplified version of the semantics of the intermediate language, which
runs on an abstract machine. The abstract machine assumes a number ($n$) of registers
and unlimited memory. The machine state in this subsection is a pair $(ic, M)$, where $ic$
is the instruction counter and $M$ maps registers and variables to integers. For example,
$M(i)$ represents the $i$-th register and $M(x)$ represents the value of $x$. Let $u$ range over
registers and variables. We write $M[u \mapsto v]$ for a mapping where $M(u)$ is updated to
$v$. We assume that values of $ic$ are integers and that $J(l)$ returns the instruction at location $l$.

We present a selected set of instructions. These instructions and their semantics are 
listed in Figure 4. 



Instruction       Next PC    Effect                      Transition Condition
add ra rb ri      ic + 1     $M(i) := M(a) + M(b)$
store ri var      ic + 1     $M(var) := M(i)$
jeq l ri rj       ic + 1                                 $M(i) \neq M(j)$
                  l                                      $M(i) = M(j)$

Fig. 4. Selected Semantic Rules



For example, an add instruction adds two operands and puts the result in the result 
location. A store instruction takes a register and a variable and stores the content of 
the register in the variable. A test and jump instruction jeq jumps to different program
locations according to the test. 
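
The three rules above are simple enough to execute. The following Python sketch is a hypothetical interpreter for just these instructions: the tuple encoding of instructions, the use of a single dictionary for registers and variables, and the convention that execution stops when the instruction counter leaves the program are all assumptions made for illustration.

```python
def run(program, memory, ic=0):
    """Execute instructions until ic leaves the program; return the final map M.

    memory maps register names and variable names to integers.
    Instructions are tuples:
      ("add", a, b, i)    M(i) := M(a) + M(b), then ic + 1
      ("store", i, var)   M(var) := M(i), then ic + 1
      ("jeq", l, i, j)    go to l if M(i) = M(j), otherwise ic + 1
    """
    m = dict(memory)
    while 0 <= ic < len(program):
        op, *args = program[ic]
        if op == "add":
            a, b, i = args
            m[i] = m[a] + m[b]
            ic += 1
        elif op == "store":
            i, var = args
            m[var] = m[i]
            ic += 1
        elif op == "jeq":
            target, i, j = args
            ic = target if m[i] == m[j] else ic + 1
        else:
            raise ValueError(f"unknown instruction: {op}")
    return m
```

A program such as `[("jeq", 3, "r0", "r1"), ("add", "r0", "r1", "r2"), ("store", "r2", "x"), ("store", "r0", "y")]` skips the addition exactly when the two registers hold equal values.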

6.2 VCGen

Intuitively, generation of the verification conditions has two steps. First, we compute the 
post condition of a block of instructions. Then we compare the post condition with the 
annotations, generating a verification condition. 

In Figure 5 we list selected rules for computing the post condition. A post condition is
considered as an abstract machine state. The abstract machine state used in this subsection
is defined by a formula in the same logic in which we write annotations. We use $\sigma$ for
such an abstract state and $v_u$ for the value of $u$. For example, the rule for instruction
add says that, for an input abstract machine state $\sigma$, the next state differs from $\sigma$ in
that the value of location $i$ is the sum. Expression $\sigma[v_i \mapsto v_x + v_y]$ can be defined as
$(\exists v_i.\ \sigma) \wedge (v_i = v_x + v_y)$. The old value of $i$ is quantified. In practice this is computed
by substituting $v_i$ with a fresh variable unless $v_i$ appears in the annotations [27].

The VCGen scans a block at a time. When the end of the block is reached, we will 
test to see whether the annotations specified for the block hold. Using (1), we generate 
verification conditions that validate the annotated Boolean assignments. We rename the 
variables if necessary because some variables in (1) are from the starting abstract state 
and some are from the resulting abstract state. As a convention, a register or variable 
subscripted with an “r” refers to the value from the final abstract state.

For example, the instruction sequence for a C statement nPacket + + is: 

load Tq nPacket; add Tq rg 1; store rg nPacket 

The Boolean program of concern is var3 := schoose(0, var3). In the logic, the
Boolean statement is encoded as $var3 \Rightarrow \neg var3_r$, where var3 is nPacket =
nPacketOld. This encoded annotation reads, “if nPacket = nPacketOld in the starting
abstract state, then nPacket = nPacketOld is not true in the final abstract state”.
The VCGen process can be intuitively interpreted as follows. From an initial state $\sigma$ where
nPacket = nPacketOld, we compute the post state of the instruction sequence, which
is described by the formulas $r0_r = nPacket + 1$; $nPacket_r = r0_r$. This formula is
tested to see if it implies

$(nPacket = nPacketOld \Rightarrow \neg(nPacket_r = nPacketOld_r))$

Results from our earlier work [29] prove that these steps validate the encoded Boolean 
program. 
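As a sanity check on the nPacket++ example, the implication can be tested by bounded enumeration. This is a minimal sketch, not the theorem-proving step used by the authors: it assumes unbounded Python integers (no machine overflow), assumes nPacketOld is untouched by the instruction sequence, and uses hypothetical function names.

```python
def post_state(n_packet, n_packet_old):
    """Execute load/add/store concretely and return the final values."""
    r0 = n_packet            # load r0 nPacket
    r0 = r0 + 1              # add r0 r0 1
    n_packet_r = r0          # store r0 nPacket
    return n_packet_r, n_packet_old   # nPacketOld is not modified

def vc_holds(n_packet, n_packet_old):
    """Check (nPacket = nPacketOld) => not (nPacket_r = nPacketOld_r)."""
    n_packet_r, n_packet_old_r = post_state(n_packet, n_packet_old)
    antecedent = (n_packet == n_packet_old)
    consequent = not (n_packet_r == n_packet_old_r)
    return (not antecedent) or consequent

# Enumerate a small range of start states; this is evidence, not a proof.
all_ok = all(vc_holds(a, b) for a in range(-50, 51) for b in range(-50, 51))
```

A real VCGen would discharge the same implication symbolically for all values rather than for a finite sample.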

We also generate VCs to validate assertions. Specifically, the meta-operation test inv
generates an implication between the current abstract state we have computed and the
assertion inv to be verified, handling necessary substitution. The meta-function $I$ maps a
location to the assertions specified at that location, or true if none is specified. Also,
because the result of pointer analysis is encoded as state assertions in the same logic, the
verification conditions generated will verify these data flow facts as well. For example,
for instruction jeq, we add the test condition to the current abstract state to verify the
assertions specified at the transition target; we add the negation of the test condition to
the abstract state to verify the assertions specified at the next program location.
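The treatment of the conditional jump can be sketched as a function that emits two proof obligations, one per branch. The tuple encoding of formulas, the function name `vcs_for_jeq`, and the dictionary used for the assertion map are assumptions made for illustration only.

```python
TRUE = ("true",)

def vcs_for_jeq(state, vi, vj, target, pc, assertions):
    """Return the two implications generated for a jump-if-equal.

    `state` is the current abstract state (a list of atoms, read as a
    conjunction) and `assertions` maps a location to its assertion.
    Locations without an entry default to true.
    """
    cond = ("=", vi, vj)
    taken = state + [cond]                    # branch to the jump target
    fall_through = state + [("not", cond)]    # continue at pc + 1
    return [
        ("implies", taken, assertions.get(target, TRUE)),
        ("implies", fall_through, assertions.get(pc + 1, TRUE)),
    ]

# Example: location 7 carries the assertion v2 = 0, location 4 has none.
vcs = vcs_for_jeq([("=", "v1", 0)], "v1", "v2", 7, 3, {7: ("=", "v2", 0)})
```

Each returned implication would then be handed to a prover; the sketch only builds the formulas.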
Instruction               Input State   Next State                             VC
add                       $\sigma$      $\sigma[v_i \mapsto v_x + v_y]$        test $I(pc + 1)$
store $r_i$ $x$           $\sigma$      $\sigma[v_x \mapsto v_i]$              test $I(pc + 1)$
iaload $r_i$ $r_j$ $v_k$  $\sigma$      $\sigma[v_i \mapsto *(v_j + v_k)]$     test $I(pc + 1)$; memsafe(access($r_j$, $v_k$))
jeq $r_i$ $r_j$           $\sigma$      $\sigma \land (v_i = v_j)$             test $I(l)$
                                        $\sigma \land \neg(v_i = v_j)$         test $I(pc + 1)$

Fig. 5. Post Condition Computation and VC Generation



7 Experience 

To test the feasibility of ACC, we have constructed the ACC Evaluation Prototype Toolkit
for C (ACCEPT/C). The initial goals for ACCEPT/C, besides serving as a prototype implementation
of a certifying compiler, are to support critical measurements to determine the
feasibility of the ACC approach. These measurements include the computational resources
required to generate, validate, and model check a certificate. Ultimately, we
expect to develop a full ACC implementation based on ACCEPT/C.

ACCEPT/C builds directly on the BLAST implementation from Berkeley, which
incorporates the CIL infrastructure for C language programs. BLAST directly gives a
mechanism to compile C to a CFA and to apply lazy abstraction to find a set of predicates
suitable for model checking properties of C programs. We have modified this portion to
measure model validation time.

ACCEPT/C extends BLAST with the ability to recover a Boolean Program from
the CFA and the predicate set used in BLAST’s lazy abstraction. This capability allows
ACCEPT/C to generate models that can be checked by other model checking tools. In
particular, we have implemented an interface to Moped [22].

ACCEPT/C also extends BLAST with a compilation mechanism to build annotated
intermediate representations. This capability is still being tested. In earlier work we
developed this capability for a simple while language. The ACCEPT/W framework was
a more complete implementation of ACC. However, it was impractical to demonstrate
engineering significance without a broadly adopted source language.

7.1 Case Studies 

The first case study we report is based on a small collection of device drivers. These
include a set of NT device drivers adopted by the BLAST group and a set of Linux drivers
we collected independently. For each driver we report the size of the preprocessed
device driver, the size of the Boolean program calculated by ACCEPT/C, the size of
the certificate calculated by BLAST, an estimate of the model validation time based
on theorem proving time used by BLAST in the lazy abstraction phase, and the model
checking time in Moped. Some representative results are summarized in Figure 6.

The results so far are encouraging. Model checking time is surprisingly low. It is
substantially less than the estimated model validation time. While it is difficult to conclude
anything from this amount of data, we suspect that the low model checking times
have something to do with the simplicity of the domain and the safety properties studied.





Driver          Program Size   Cert. Size   BLAST Cert. Size   Model Validation Time   Verif. Time
Linux atp.c     2482           1212         8737               0.1s                    0.08s
Linux ide.c     48K            876          12452              0.1s                    0.10s
Linux audio.c   175K           15410        502K               0.2s                    0.20s
NT cdaudio.c    456A           55A          233M               0.5s                    0.15s
NT floppy.c     UOK            51A          33M                0.9s                    0.05s

Fig. 6. Comparison between ACCEPT/C and BLAST



The second case study is also taken from the BLAST examples. It is based on a 
single program, an ssh server implementation. This example allows us to explore one of 
the initial motivations for considering the use of models as certificates: in principle one 
model may be able to certify multiple properties. (Ultimately we would like to validate 
a property that was unknown at the time the model was created, perhaps due to API 
evolution.) 

The BLAST test bed identifies 13 separate properties to be verified.* The BLAST
lazy abstraction algorithm discovers 17 predicates to establish these properties. We constructed
Boolean programs to verify the properties individually and collectively. The
results show that many predicates are useful to most of the properties. Figure 7 presents a
histogram that graphs the number of properties in which each predicate participated.
Over half of the properties used more than eight predicates in common.

With ACCEPT/C we calculated a single Boolean Program that can be used to certify
all 13 properties. This collective certificate of the 13 properties is only 25% larger than
the largest certificate necessary to verify a single property. This suggests that ACC may
be useful for building certificates that can support validation of multiple properties.



8 Conclusion 

Abstraction-carrying code provides a framework for the certification of properties of
engineering significance about programs in an untrusted environment. Experience with
the ACCEPT/C toolkit shows that ACC certificates are reasonably compact and that the
generation, validation, and re-verification of certificates is tractable.

When compared with Temporal PCC, ACC may require more client side computation
but generates significantly more compact certificates. ACC also requires a larger trust
base than most PCC variants. An ACC client must trust a model checker and an automatic
theorem proving tool capable of establishing the fidelity of the certificate.

* The BLAST test bed formulates 16 properties, but we are able to produce the results for 13 of 
the cases. 
[Histogram; axis labels: “# of times a predicate is used in an abstraction” and “predicates”]

Fig. 7. The Occurrence of Predicates in the Abstractions



Future work includes the completion of the C-based ACC implementation. The most
significant outstanding task is to implement client side support for the annotated intermediate
language. In our preliminary investigation of ACC we have completed this task
for a Java virtual machine variant. We expect the same techniques to apply.

We believe that, in general, certificates based on communicating an abstract model
of a system will be more robust in use than proof based certificates. In the future a client
will formulate properties based on the current versions of software found on the client
system. The server will not need to have anticipated the exact configuration to provide
a model that may be sufficient to validate the component. With proof based certificates
such flexibility is not possible. Ultimately we expect the robustness of certificates to be
a more important issue than the size of the trust base.
Abstract. In recent work, Flanagan and Qadeer proposed atomicity
declarations as a light-weight mechanism for specifying non-interference 
properties in concurrent programming languages such as Java, and they 
provided a type and effect system to verify atomicity properties. While 
verification of atomicity specifications via a static type system has several 
advantages (scalability, compositional checking), we show that verifica- 
tion via model-checking also has several advantages (fewer unchecked 
annotations, greater coverage of Java idioms, stronger verification). In 
particular, we show that by adapting the Bogor model-checker, we nat- 
urally address several properties that are difficult to check with a static 
type system. 



1 Introduction 

Reasoning about possible interleavings of instructions in concurrent object- 
oriented languages such as Java is difficult, and a variety of approaches including 
model checking [2,4], static and dynamic [18] analyses and type systems [7] for 
race condition detection, and theorem-proving techniques based on Hoare-style 
logics [9] have been proposed for detecting errors due to unanticipated thread 
interleavings. 

In recent work, Flanagan and Qadeer [10] noted that developers of concurrent
programs often craft methods using various locking mechanisms with the goal
of building method implementations that can be viewed as atomic in the sense
that any interleaving of atomic method m’s instructions should give the same
result as executing m’s instructions without any interleavings (i.e., in a single
atomic step). They provide an annotation language for specifying the atomicity
of methods, and a verification tool based on a type and effect system for verifying
the correctness of a Java program annotated with atomicity specifications.

We believe that the atomicity system of Flanagan and Qadeer represents a
very useful specification mechanism, in that it checks a light-weight and semantically
simple property (e.g., as opposed to more expressive temporal specifications
that are harder to write) that can reveal a variety of programming errors associated
with improper synchronization. For example, a common Java programming
error is to assume that if a method is synchronized, then it is free from interference
errors. However, Flanagan and Qadeer illustrate that this is not the case
[10, p. 345] using the java.util.StringBuffer example of Figure 1(a). After
append calls the method sb.length (which is also synchronized), a second
thread could remove characters from sb. Thus, len becomes stale and no
longer reflects the current length of sb, and so getChars is called with invalid
arguments and throws a StringIndexOutOfBoundsException.

(a) java.util.StringBuffer (excerpts)

    public synchronized StringBuffer
    append(StringBuffer sb) {
      if (sb == null) { sb = NULL; }
      int len = sb.length();
      int newcount = count + len;
      if (newcount > value.length)
        expandCapacity(newcount);
      sb.getChars(0, len, value, count);
      count = newcount;
      return this;
    }

(b) Simple dead-locking method

    public void m1(Object p1,
                   Object p2) {
      int x;
      synchronized (p1) {
        synchronized (p2) {
          x = 1;
        }
      }
    }

Fig. 1. Basic examples of non-atomic methods
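The stale-length interleaving described for Figure 1(a) can be replayed deterministically in a small simulation. This is a sketch under stated assumptions: `Buffer`, `set_length`, and the `between` hook are hypothetical stand-ins, each method models one individually synchronized call, and the second thread is simulated by running a callback at the point where the lock on sb is not held.

```python
class Buffer:
    """Toy character buffer; each method models one synchronized call."""

    def __init__(self, text):
        self.chars = list(text)

    def length(self):
        return len(self.chars)

    def set_length(self, n):
        # Truncate the buffer, as a second thread might do.
        del self.chars[n:]

    def get_chars(self, begin, end):
        # Python slices never raise, so check the bound explicitly.
        if end > len(self.chars):
            raise IndexError("end index past current length")
        return self.chars[begin:end]

def append(dest, sb, between=lambda: None):
    """Append sb to dest using two separate calls on sb."""
    n = sb.length()          # first call: lock on sb is released afterwards
    between()                # another thread may run here
    dest.chars.extend(sb.get_chars(0, n))   # second call uses the stale n

# Without interference the append succeeds.
d, s = Buffer("ab"), Buffer("cde")
append(d, s)
```

Passing `between=lambda: s.set_length(1)` makes the second call fail, which corresponds to the exception described above: each call is atomic on its own, but the pair is not.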

Inspired by the utility of the atomicity specifications, we investigated the 
extent to which these specifications could be checked effectively using model- 
checking. Clearly, checking via a type system can provide several advantages 
over model-checking: the type system approach is compositional - classes can be 
checked in isolation, a closing environment or test-harness need not be created, 
and computational complexity is lower which leads to better scalability. 

Our efforts, which we report on in this paper, indicate that there are also
several benefits in checking atomicity specifications using model checking when
one employs a sophisticated software-dedicated model-checking engine such as
Bogor [16], an extensible software model checker that we have built as the core
of the next generation of the Bandera tool set. Bogor provides state-of-the-art
support for model-checking concurrent object-oriented languages including heap
symmetry reductions, garbage collection, partial order reduction (POR) strategies
based on static and dynamic (i.e., carried out during model-checking) escape
analyses and locking disciplines, and sophisticated state compression algorithms
that exploit object state sharing. Checking atomicity specifications with Bogor
offers the following benefits.

— Due to the approximate and compositional nature of the type system, several 
forms of annotations are required for sufficient accuracy such as precondi- 
tions for each method stating which locks must be held upon entry to the 
method and declarations for each lock-protected object that indicates the 
lock that must be held when accessing that object. Much of this information 
can be derived automatically during Bogor’s state-space exploration. 

— The type system cannot handle some complex locking idioms since its static
nature requires that objects and methods be statically associated with locks
via annotations. Bogor’s partial order reduction strategies were developed
specifically to accommodate these more complex idioms; therefore Bogor
can verify many methods that rely on more complex locking schemes for
atomicity.

— The type system as presented in [10] requires several forms of unchecked annotations
or assumptions about objects shared between threads, and Flanagan
and Qadeer note that static escape analysis could be used to partially
eliminate the need for these. In our previous work on using escape analysis
to enable partial order reductions [5], we concluded that the dynamic escape
analysis as implemented in Bogor performs dramatically better than static
escape analysis for reasoning about potential interference in state-space exploration.
Thus, Bogor provides a very effective solution for the unchecked
annotations using previously implemented mechanisms.

— The type system [10] actually fails to enforce a key condition of Lipton’s
reduction framework, and thus will verify as atomic some methods whose
interleaving can lead to deadlock. Bogor’s atomicity verification (based on
the ample set partial order reduction framework, which provides reductions
that preserve LTL$_{-X}$ properties as well as deadlock conditions) correctly
identifies methods that violate this condition, and thus avoids classifying
these interfering methods as atomic.

Finally, Bogor’s aggressive state-space optimizations enable atomicity check- 
ing without significant overhead; for most of the examples reported on in [10], 
for example, Bogor was able to complete the checks in under one second. 

We do not conclude that the model-checking approach is necessarily better 
than the type system approach to checking atomicities. As noted earlier, a type 
system approach to checking has several advantages over the model-checking 
approach. 

— The type system approach, though conservative, naturally guarantees complete
coverage of all behaviors of a software unit. In contrast, model-checking
a software unit requires a test harness that simulates interactions that the
unit might have with a larger program context. During model-checking, the
behaviors that are explored are exactly those generated by the test harness.
It is usually difficult to guarantee that the test harness will drive the
unit through all behaviors represented by any client context, and thus some
behaviors that lead to atomicity violations may be missed if the test harness
is not carefully constructed. Thus, checking atomicity specifications via
model-checking is typically closer to debugging than to the systematic guarantees
offered by the type system approach. However, it is important to note
that for complex programs unchecked type annotations are often required;
consequently, the type system will not cover the behavior that it assumes
from the annotations.

— The type system approach scales better. In addition, by its very nature, the 
type system approach is compositional, which allows program units to easily 
be checked in isolation. This enables incremental and modular checking of 
atomicity specifications. 
We also do not conclude that all model checkers can effectively check atomicity
specifications. Using conventional model-checkers that do not provide direct
support for heap-allocated data would be more difficult since information concerning
locking or object-sharing is not directly represented.

Our conclusion is that type systems and model-checking are complementary 
approaches for checking the atomicity specification of Flanagan and Qadeer [10]. 
Moreover, in model-checkers such as Bogor and JPF that provide direct support 
for objects, checking for atomicity specifications should be included because such 
specifications are useful, and model-checking provides an effective verification 
mechanism for these specifications. 

The rest of this paper is organized as follows. Section 2 provides a brief
overview of Lipton’s reduction theory, the atomicity type/effect system of
Flanagan and Qadeer, and Bogor’s enhancement of the ample set POR framework.
Section 3 describes how Bogor’s state-space exploration algorithm is augmented
to check atomicity specifications. Section 4 presents experimental results
drawn from the basic Java library examples used by Flanagan and Qadeer [10]
as well as larger examples used in our previous work on partial order reductions
[5]. Section 5 discusses related work, and Section 6 concludes.

2 Background 

A state transition system [3] $A$ is a quadruple $(S, T, S_0, L)$ with a set of states
$S$, a set of transitions $T$ such that for each $\alpha \in T$, $\alpha \subseteq S \times S$, a set of initial
states $S_0$, and a labeling function $L$ that maps a state $s$ to a set of primitive
propositions that are true at $s$.

For the Java systems that we consider, each state $s$ holds the stack frames for
each thread (program counters and local variables for each stack frame), global
variables (i.e., static class fields), and a representation of the heap. Intuitively,
each $\alpha \in T$ represents a statement or step (e.g., execution of a bytecode) that
can be taken by a particular thread $t$. In general, $\alpha$ is defined on multiple “input
states”, since the transition may be carried out, e.g., not only in a state $s$ but
also in another state $s'$ that only differs from $s$ in that it represents the result
of another thread $t'$ performing a transition on $s$.

For a transition $\alpha \in T$, we say that $\alpha$ is enabled in a state $s$ if there is a state
$s'$ such that $\alpha(s, s')$ holds. Otherwise, $\alpha$ is disabled in $s$.
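The definitions above translate directly into a small data structure. The following is a minimal sketch for finite systems only; the class name, field names, and the two-state lock example are illustrative assumptions rather than anything taken from Bogor.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TransitionSystem:
    states: frozenset
    transitions: dict   # transition name -> set of (s, s') pairs
    initial: frozenset
    label: dict         # state -> set of propositions true at that state

    def enabled(self, name, s):
        """A transition is enabled in s if it relates s to some s'."""
        return any(src == s for (src, _dst) in self.transitions[name])

    def enabled_set(self, s):
        """All transitions enabled in state s."""
        return {name for name in self.transitions if self.enabled(name, s)}

# Example: a single lock that is either free or held.
ts = TransitionSystem(
    states=frozenset({"unlocked", "locked"}),
    transitions={
        "acq": {("unlocked", "locked")},
        "rel": {("locked", "unlocked")},
    },
    initial=frozenset({"unlocked"}),
    label={"unlocked": set(), "locked": {"held"}},
)
```

In this toy system the acquire is enabled only while the lock is free, so `ts.enabled_set("locked")` contains just the release.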



2.1 Lipton’s Reduction Theory 

Lipton introduced the theory of left/right movers to aid in proving properties
about concurrent programs [14]. The motivation is that proofs can be made
simpler if one is allowed to assume that a particular sequence of statements
is indivisible, i.e., that the statements cannot be interleaved with statements
from other threads. In order to conclude that a program $P$ with a particular
sequence of statements $S$ is equivalent to a reduced program $P/S$ where the
statements $S$ are executed in one indivisible transition, Lipton proposed the
notion of commuting transitions in which particular program statements are
identified as left movers or right movers. Intuitively, a transition $\alpha$ is a right
(left) mover if whenever it is followed (preceded) by any other transition $\beta$ of a
different thread, then $\alpha$ and $\beta$ can be swapped without changing the resulting
state. Concretely, Lipton established that a lock acquire (e.g., at
the beginning of a Java synchronized block b) is a right mover, and the lock
release at the end of b is a left mover. Any read or write to a variable/field that
is properly protected by a lock is both a left and right mover.

Fig. 2. Left/Right movers and atomic blocks

To illustrate the application of these ideas, we repeat the example given in
[10]. Consider a method m that acquires a lock, reads a variable x protected by
that lock, updates x, and then releases the lock. Suppose that the transitions of
this method are interleaved with transitions $E_1$, $E_2$, and $E_3$ of other threads.
Because the actions of the method m are movers (acq and rel are right and left
movers, respectively, and the lock-protected assignment to x is both a left and
right mover), Figure 2 implies that there exists an equivalent execution where
the operations of m are not interleaved with operations of other threads. Thus,
it is safe to reason about the method as executing in a single atomic step.
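The reduction argument can be illustrated on one concrete execution. In this sketch the steps of m and the steps of the other threads are modelled as functions on a dictionary state; the names `e1`, `e2`, `e3` and the variable `y` are assumptions, and the other threads are constructed so that they never touch x or the lock, which is what makes the swaps legal. It demonstrates one instance and is not a proof.

```python
def acq(s):
    assert s["lock"] is None      # the lock must be free to be acquired
    return {**s, "lock": "m"}

def read_x(s):
    return {**s, "t": s["x"]}     # t = x

def write_x(s):
    return {**s, "x": s["t"] + 1} # x = t + 1

def rel(s):
    assert s["lock"] == "m"
    return {**s, "lock": None}

# Steps of other threads: they only touch y, never x or the lock.
def e1(s): return {**s, "y": s["y"] + 1}
def e2(s): return {**s, "y": s["y"] * 3}
def e3(s): return {**s, "y": s["y"] - 4}

def run(trace, state):
    """Apply the steps of a trace in order and return the final state."""
    for step in trace:
        state = step(state)
    return state

init = {"lock": None, "x": 0, "t": None, "y": 1}
interleaved = [acq, e1, read_x, e2, write_x, e3, rel]
# acq moved right past e1; write_x and rel moved left past e2 and e3.
serial = [e1, acq, read_x, write_x, rel, e2, e3]
same = run(interleaved, init) == run(serial, init)
```

Both traces end in the same state, and in the second one the four steps of m run without interruption.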

Although the original presentation of Lipton is rather informal, he does identify
the property that is preserved when programs are reduced according to his
theory: the set of final states of a program $P$ equals the set of final states of the
reduced program $P/S$; in particular $P$ halts iff $P/S$ halts. To achieve this property,
Lipton imposes two restrictions on a set of statements $S$ that are grouped
together to form atomic statements [14, p. 719]. Restriction (R1) states that “if
$S$ is ever entered then it should be possible to eventually exit $S$”; Restriction
(R2) states that “the effect of statements in $S$ when together and separated
must be the same.” (R1) represents a fairly strong liveness requirement that can
be violated if, e.g., $S$ contributes to a deadlock or livelock, or fails to complete
because it performs a Java wait and is never notified. (R2) is essentially stating
an independence property for $S$: any interleavings between the statements of $S$
do not affect the final store it produces.



2.2 Type and Effect System for Checking Atomicity 

The type system of Flanagan and Qadeer assigns an atomicity a to each expres- 
sion of a program where a is one of the following values: const (the same value 
is always returned each time the expression is evaluated), mover (the expression 
both left and right commutes with operations of other threads), atomic (the 
expression can be viewed as a single atomic action), and compound (none of the 
previous properties hold). 

To classify a field access with a mover atomicity, the type system needs
to know that the field is protected by a lock. The type system requires the
following program annotations to be attached to fields and methods to express
this information.

— field guarded_by $l$: the lock represented by the lock expression $l$ must be
held whenever the field is read or written.

— field write_guarded_by $l$: the lock represented by the lock expression $l$ must
be held for writes, but not necessarily for reads.

— method requires $l_1, \ldots, l_n$: a method precondition that requires the locks
represented by lock expressions $l_1, \ldots, l_n$ to be held upon method entry; the
type system verifies that these locks are held at each call-site of the method,
and uses this assumption when checking the method body.

If neither guarded statement is present, the field can be read or written at any 
time (such accesses are classified as atomic since these correspond to a single 
JVM bytecode). 

For soundness, the type system requires that each lock expression denote 
a fixed lock through the execution of the program. This is guaranteed in a 
conservative manner by ensuring that each lock expression has atomicity const; 
such expressions include references to immutable variables, accesses to final fields 
of const expressions, and calls to const methods. In particular, note that the 
this identifier of Java which refers to the receiver of the current method is 
const, and many classes in Java are synchronized by locking the receiver object. 

The type system contains rules for composing statement atomicities which we
do not repeat here, but we simply note that the pattern of statement atomicities
required for an atomic method takes the form given by the regular expression
mover* atomic? mover* (i.e., 0 or more movers followed by 0 or 1 atomic statements
followed by 0 or more movers).
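The pattern mover* atomic? mover* can be checked mechanically on a sequence of statement atomicities. The sketch below is an illustration only: it encodes each atomicity as one letter and uses Python's standard `re` module. Rejecting any atomicity other than mover and atomic is a simplifying assumption of the sketch, not a rule of the type system.

```python
import re

# One-letter codes so that a sequence becomes a string.
CODES = {"mover": "m", "atomic": "a"}
PATTERN = re.compile(r"m*a?m*")

def is_atomic_sequence(atomicities):
    """True if the sequence matches mover* atomic? mover*."""
    try:
        encoded = "".join(CODES[a] for a in atomicities)
    except KeyError:
        # Anything else (for example "compound") is rejected in this sketch.
        return False
    return PATTERN.fullmatch(encoded) is not None

ok = is_atomic_sequence(["mover", "atomic", "mover"])
bad = is_atomic_sequence(["atomic", "mover", "atomic"])
```

A sequence with two atomic statements fails because the second one cannot be absorbed by either mover run.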

When considering the effectiveness of the type system for enforcing the restrictions
(R1) and (R2) laid out by Lipton, there are several interesting things
to note. There is no attempt by the type system to enforce the first condition
(R1). This allows the method of Figure 1(b) to be assigned an atomic atomicity
even though interleaving of its instructions can cause a deadlock when it is used
in a context where a thread $t_1$ calls it with objects $o_1, o_2$ as the parameters and
a thread $t_2$ calls it with objects $o_2, o_1$ as parameters. Indeed, it is difficult to
imagine any type system or static analysis enforcing the condition (R1) without
being very conservative (perhaps prohibitively so) or without resorting to
various forms of unchecked annotations.
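The deadlock scenario for the method of Figure 1(b) can be found by exhaustively exploring the interleavings of its lock operations. The following is a minimal explicit-state sketch, not Bogor's algorithm: the state encoding and all function names are assumptions, and locks are modelled as non-reentrant, which is harmless here because each thread acquires each lock only once.

```python
from collections import deque

def m1(p1, p2):
    """Lock operations performed by the method of Figure 1(b)."""
    return [("acq", p1), ("acq", p2), ("rel", p2), ("rel", p1)]

def successors(state, programs):
    """Yield every state reachable by one step of one thread."""
    pcs, owners = state
    held = dict(owners)
    for tid, prog in enumerate(programs):
        if pcs[tid] == len(prog):
            continue                      # this thread has finished
        op, lock = prog[pcs[tid]]
        if op == "acq" and lock in held:
            continue                      # blocked: lock is taken
        new_held = dict(held)
        if op == "acq":
            new_held[lock] = tid
        else:
            del new_held[lock]
        new_pcs = pcs[:tid] + (pcs[tid] + 1,) + pcs[tid + 1:]
        yield (new_pcs, tuple(sorted(new_held.items())))

def find_deadlocks(programs):
    """Breadth-first search for states with no successor and unfinished threads."""
    init = (tuple(0 for _ in programs), ())
    seen, queue, deadlocks = {init}, deque([init]), []
    while queue:
        state = queue.popleft()
        nxt = list(successors(state, programs))
        finished = all(pc == len(p) for pc, p in zip(state[0], programs))
        if not nxt and not finished:
            deadlocks.append(state)
        for s in nxt:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return deadlocks

crossed = find_deadlocks([m1("o1", "o2"), m1("o2", "o1")])
aligned = find_deadlocks([m1("o1", "o2"), m1("o1", "o2")])
```

With the arguments crossed, the search reaches the state in which each thread holds one lock and waits for the other; with the same argument order in both threads, no deadlock state exists.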

In many cases, the type system enforces (R2) in a very conservative way.
It does not incorporate escape information, which would allow accesses to fields
of an object o that are not lock-protected, but where o is only reachable from
a single thread, to be classified as mover rather than atomic. In addition, the
restriction on lock expressions in annotations that only allows references from
constant fields means that more complex locking strategies are difficult to handle
effectively in the type system.

2.3 Partial-Order Reductions for OO Languages 

Lipton’s theory for commuting transitions differs slightly from the conditions
for commuting transitions used in most POR frameworks. For example, Lipton
defines a transition $\alpha$ as a right-mover if it right commutes with all other
transitions $\beta$ from other threads. Commutativity properties in POR frameworks
are often captured by a symmetric independence relation $I$ on transitions such
that $I(\alpha, \beta)$ implies (a) if $\alpha, \beta \in enabled(s)$ then $\alpha \in enabled(\beta(s))$, and (b) if
$\alpha, \beta \in enabled(s)$ then $\alpha(\beta(s)) = \beta(\alpha(s))$. The two notions of commutativity
differ in the following ways. A right mover must right-commute with all transitions
from other threads, while the independence relation allows a transition
to commute with only some other transitions. In addition, the commutativity
definitions in the Lipton theory are asymmetric, whereas the independence relation
$I$ is required to be symmetric. This fact, along with condition (a) on
$I$, implies that a transition $\alpha$ that is a lock acquire cannot be independent of
another transition $\beta$ that acquires the same lock, because $\alpha$ can disable $\beta$ by
acquiring the lock. However, the conditions of Lipton allow an acquire to be a
right mover (see [14, p. 719] for details).

    1   seen := {s_0}
    2   pushStack(s_0)
    3   DFS(s_0)

    DFS(s)
    4     workSet := ample(s)
    5     while workSet ≠ ∅ do
    6       let α ∈ workSet
    7       workSet := workSet \ {α}
    8       s' := α(s)
    9       if s' ∉ seen then
    10        seen := seen ∪ {s'}
    11        pushStack(s')
    12        DFS(s')
    13    popStack()
    end DFS

Fig. 3. Depth-first search with partial order reduction (DFS-POR)
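Conditions (a) and (b) on the independence relation can be checked by brute force on a small finite system. In this sketch a deterministic transition is modelled as a dictionary from the states where it is enabled to its successor state; the state layout (lock owner, x, y) and all names are assumptions chosen for illustration.

```python
from itertools import product

def independent(alpha, beta, states):
    """Check conditions (a) and (b), in both directions, on every state."""
    for s in states:
        if s in alpha and s in beta:
            # (a) neither transition disables the other
            if beta[s] not in alpha or alpha[s] not in beta:
                return False
            # (b) both execution orders reach the same state
            if alpha[beta[s]] != beta[alpha[s]]:
                return False
    return True

# States are (lock owner, x, y) with owner in {None, 1, 2} and bits x, y.
states = list(product([None, 1, 2], [0, 1], [0, 1]))

# Thread 1 and thread 2 each try to acquire the same lock.
acq1 = {s: (1, s[1], s[2]) for s in states if s[0] is None}
acq2 = {s: (2, s[1], s[2]) for s in states if s[0] is None}

# Two updates of different variables, enabled everywhere.
flip_x = {s: (s[0], 1 - s[1], s[2]) for s in states}
flip_y = {s: (s[0], s[1], 1 - s[2]) for s in states}

acquires_independent = independent(acq1, acq2, states)
flips_independent = independent(flip_x, flip_y, states)
```

The two acquires fail condition (a), since each one disables the other, while the updates to different variables satisfy both conditions.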

In recent work [5], we described a partial order reduction framework for 
model-checking concurrent object-oriented languages that we have implemented 
in Bogor. This framework is based on the ample set approach of Peled [15] and 
detects independent transitions using locking information as inspired by the work 
of Stoller [19], reachability properties of the heap, and dynamic escape analysis. 
Independent transitions detected in our framework include accesses to fields that 
are always lock-protected or read-only, accesses to fields of objects that are only 
reachable through lock-protected objects, and accesses to thread-local objects, 
i.e., objects that are only reachable from a single thread at a given state s, etc. 
Because Bogor maintains an explicit representation of the heap, it is quite easy 
and efficient to check the heap properties necessary to detect the independence 
situations listed above. Independence properties for thread-local (non-escaping)
objects are detected automatically by scanning the heap. Independence for
read-only fields requires a read-only annotation on the field, and independence
for lock-protected fields requires a simple lock-protected annotation on the field (both of these
annotations are checked for violations during model-checking). Note that our
locking annotation is simpler than the annotation of Flanagan and Qadeer in
that one does not need to give a static expression indicating the guarding lock.
Dynamically scanning the heap during model checking allows independence to be
detected in a variety of settings, e.g., when the lock protecting an object changes
during program execution or when locking is not required in states where the object
associated with the field is thread-local.
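The heap scan for thread-local objects mentioned above amounts to a reachability computation from each thread's roots. The sketch below assumes a heap given as an adjacency dictionary and per-thread root sets; it ignores static fields and is not a description of Bogor's implementation.

```python
def reachable(heap, roots):
    """Return all objects reachable from `roots` by following references."""
    seen, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(heap.get(obj, ()))
    return seen

def thread_local_objects(heap, thread_roots):
    """Map each object reachable from exactly one thread to that thread."""
    reach = {t: reachable(heap, roots) for t, roots in thread_roots.items()}
    local = {}
    for obj in heap:
        owners = [t for t, objs in reach.items() if obj in objs]
        if len(owners) == 1:
            local[obj] = owners[0]
    return local

# Example heap: object -> objects it references.
heap = {"a": ["b"], "b": [], "c": ["b"], "d": []}
thread_roots = {"t1": ["a"], "t2": ["c", "d"]}
local = thread_local_objects(heap, thread_roots)
```

Object b is reachable from both threads and is therefore not thread-local in this state, while a, c, and d each belong to a single thread.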

Figure 3 presents the depth-first search state-exploration algorithm that incorporates
the ample set partial order reduction strategy [3, p. 143]. Space constraints
do not allow a detailed explanation, but the intuition of the approach
is as follows. At each state $s$ with outgoing enabled transitions $enabled(s) =
\{\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_m\}$, $ample(s)$ returns a subset $\{\alpha_1, \ldots, \alpha_n\}$ to expand. The
selection of $ample(s)$ is constrained in such a way as to guarantee that the reduced
system explored by Figure 3 satisfies the same set of LTL$_{-X}$ formulas
as the unreduced system (in which all enabled transitions from each state are
expanded). Following [3, p. 158], our strategy for choosing $ample(s)$ is to find a
thread $t$ such that all of its transitions at $s$ are independent of transitions from
all other threads at $s$, and return that set $enabled(s, t)$ as the value of $ample(s)$
(technically, a few other conditions must be satisfied to preserve LTL$_{-X}$ properties).
If such a thread $t$ cannot be found, $ample(s)$ is set to $enabled(s)$. We refer
the reader to [5] for details.
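The search of Figure 3 together with the thread-based choice of the ample set can be sketched as follows. This is a simplified illustration: it implements only the "one thread independent of all others" selection and omits the additional conditions needed to preserve temporal properties. The toy system, the function names, and the representation of a transition as a (thread, action) pair are all assumptions.

```python
def make_ample(enabled, independent):
    """Build ample(s): the enabled set of one thread if it is independent
    of every other thread's enabled transitions, else all of enabled(s)."""
    def ample(s):
        en = enabled(s)
        for t in sorted({a[0] for a in en}):
            mine = [a for a in en if a[0] == t]
            others = [b for b in en if b[0] != t]
            if all(independent(a, b) for a in mine for b in others):
                return mine
        return en
    return ample

def dfs_por(initial, apply_transition, ample):
    """Depth-first search expanding only ample(s) at each state."""
    seen = {initial}
    stack = [initial]

    def dfs(s):
        work_set = list(ample(s))
        while work_set:
            alpha = work_set.pop()
            s2 = apply_transition(alpha, s)
            if s2 not in seen:
                seen.add(s2)
                stack.append(s2)
                dfs(s2)
        stack.pop()

    dfs(initial)
    return seen

# Toy system: thread t1 sets x to 1, thread t2 sets y to 1.
def enabled(s):
    x, y = s
    out = []
    if x == 0:
        out.append(("t1", "inc_x"))
    if y == 0:
        out.append(("t2", "inc_y"))
    return out

def apply_transition(alpha, s):
    x, y = s
    return (x + 1, y) if alpha[1] == "inc_x" else (x, y + 1)

reduced = dfs_por((0, 0), apply_transition, make_ample(enabled, lambda a, b: True))
full = dfs_por((0, 0), apply_transition, enabled)
```

On this system the full search visits four states, while the reduced search visits three because it never explores the order in which t2 moves first.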



3 Model-Checking Atomicity Specifications 



We have presented two frameworks for describing commuting transitions: the
Lipton mover framework, and the independence framework used in POR. We
now describe how atomicity specifications can be checked by augmenting Figure 3
to classify transitions according to each of these frameworks. In effect, this
yields two approaches for checking atomicity specifications using model checking:
MC-mover, which classifies transitions according to Lipton’s framework, and
MC-ind, which classifies transitions according to the independence framework.
During our presentation, we will compare these approaches and contrast them
with the Type-System checking approach of Flanagan and Qadeer.

In our atomicity checker, a method m or block of code can be annotated
as compound (the default specification, which every method trivially satisfies),
atomic, or mover. Intuitively, a mover method is a special case of an atomic
method where all transitions are both left and right movers according to MC-mover or are all
independent according to MC-ind. For simplicity, we only describe checking
atomic methods, since checking the patterns for mover
requires only very minor changes.

In MC-ind, transitions can be classified uniquely as independent (I) or de- 
pendent (D) in Bogor by leveraging locking and heap structure information. In 
MC-mover, transitions can be classified as left-mover (L), right-mover (R), or 
atomic (A), and we allow a single transition to have both L and R classifications. 

For an atomic method m, the goal of MC-ind checking is to guarantee that the 
transitions of m follow the pattern I* D? I* every time method m is encountered
during the state-space search. For each thread t, the checker uses an internal 
region position variable held in the state-vector with values (N,I,D) to mark 
where t’s program counter (pc) is positioned with respect to the required transi- 
tion pattern. N indicates t’s pc is not in a block annotated as atomic, I indicates 
t’s pc is in an atomic block, but has not yet executed a D transition, and D in- 
dicates t’s pc is in an atomic block, and has already executed a D transition. 
Checking using MC-mover looks for the pattern R* A? L* with region position values
(N,R,L) where R indicates that t’s pc is in an atomic block but has not yet
executed an A or L transition, and L indicates t has executed an A (i.e., t’s pc 
is moving into or through the L region of the pattern). 
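
As a small illustration of the two patterns, the sketch below encodes the sequence of classifications seen inside one atomic region as a string and matches it against a regular expression. The string encoding and all names are assumptions made for this example; the checker described here tracks the position incrementally during the search rather than matching complete traces, and transitions that carry both L and R classifications are not modeled.

```java
import java.util.regex.Pattern;

/** Illustrative trace-level view of the two atomicity patterns. */
final class AtomicityPatterns {

    // I = independent transition, D = dependent transition.
    private static final Pattern MC_IND = Pattern.compile("I*D?I*");

    // R = right mover, A = atomic (non-mover) transition, L = left mover.
    private static final Pattern MC_MOVER = Pattern.compile("R*A?L*");

    /** True when the classification string fits the MC-ind pattern. */
    static boolean conformsInd(String trace) {
        return MC_IND.matcher(trace).matches();
    }

    /** True when the classification string fits the MC-mover pattern. */
    static boolean conformsMover(String trace) {
        return MC_MOVER.matcher(trace).matches();
    }
}
```

For instance, a nested acquire/release sequence encoded as "RRLL" fits the MC-mover pattern, while the same method encoded as "DDII" under the independence classification does not fit the MC-ind pattern.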

Fig. 4. Additions to DFS-POR for MC-ind



In addition to checking that m conforms to the required pattern of transitions,
the checker will also flag as “unverified” any atomic or mover method
in which a thread cannot eventually move past the end of that method. We
have two approaches for implementing this. The first uses the standard approach 
of Spin [12] to detect invalid end-states (completions of execution where a thread 
has not moved past the end of an atomic method) as well as nested depth-first 
search to look for non-progress cycles (indicating live-locks). Nested depth-first 
search makes this approach more expensive, so we implemented a cheaper scheme 
that looks directly for cycles in a lock-holding graph and can catch many com- 
mon cases. 

Figure 4 presents the MC-ind algorithm. Due to space constraints, we have 
elided the parts of the algorithm (using . . . ) that do not change from the basic 
DFS algorithm presented in Figure 3. Line numbers x.y of Figure 4 are inserted 
after the line number x of Figure 3. 

The input of the algorithms is a set of regions $\mathcal{R}_{atomic}$, where each region
represents a method or block of code that is annotated as atomic. In this presentation,
we represent a region $\rho \in \mathcal{R}_{atomic}$ as a non-empty finite sequence of
transitions $[\alpha_1, \ldots, \alpha_n]$ drawn from the same thread. For simplicity, we assume
that regions do not overlap with each other, and there are no loop edges across
regions.^ Given a thread t and a state s, we define the function region to
return a singleton set containing the region $\rho \in \mathcal{R}_{atomic}$ if one of t's transitions at
state s is in the sequence of $\rho$. Otherwise, it returns the empty set.



^ Our actual implementation handles all of Java, and allows e.g., atomic methods to 
call other atomic methods, etc. 

Fig. 5. Additions to DFS-POR for MC-mover



Region positions are maintained using the position table $\mathcal{M}_{ind}$ (line 2.1)
that maps each thread t to its position status {N,I,D}. Given a thread t and
a position table $\mathcal{M}$, we define the function getRegionPosition to return N if
$t \notin dom(\mathcal{M})$; otherwise, it returns $\mathcal{M}(t)$.

Note that transition classifications are necessarily conservative since the analysis
must be safe, and so a method may actually be atomic, yet fail to be classified
as such because of imprecision in the Bogor independence detection algorithm.
Yet there are some cases where we can tell that there is definitely an atomicity
violation (e.g., if a transition executes a wait lock operation, or a deadlock or cycle
within the region is found). We wish to distinguish these cases when reporting
to the user, so we use one subset of $\mathcal{R}_{atomic}$ to hold the regions that definitely
violate their atomicity annotations (line 2.1), and a second subset of $\mathcal{R}_{atomic}$ to
hold the regions that possibly violate their atomicity annotations as determined
by transition classifications (lines 2.2-2.3).

The algorithm proceeds as follows (for liveness issues, we include an explanation
of our less expensive checks, since the strategy using the nested depth-first
search is well-documented [12]). Once the ample set is constructed for the current
state (line 4), we check whether there exists a thread that is in one of its atomic
regions and is in a (fully or partially) deadlocked state (line 4.1 and lines 52-59).
If a deadlock exists, then the atomic region cannot really be atomic (given a
set of threads $\Theta$, we define the function circSync to return True if there is a cycle
in the lock-holding/acquiring relation for the threads of $\Theta$). Once a transition $\alpha$ is
selected from the work set, we save the region position of the thread t that
is executing $\alpha$ (lines 6.1-6.2) so that we can restore the position marker when
backtracking later on (line 13.2). After $\alpha$ is executed, we update the region position
of t (line 8.1). If the next state s' has been seen before, we check if s' is in
the stack. If it is, then there is a cycle (non-terminating loop) in the state space.
Thus, if t is in the same atomic region at state s', then that region definitely
cannot be atomic (line 13.1 and lines 36-38).
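
The cheaper deadlock check can be pictured as a cycle search over a "waits for the holder of" relation. The following sketch is an assumption-laden illustration, not the algorithm of Figure 4: the two maps (lock to holding thread, thread to awaited lock) and the method signature are hypothetical, and each thread is assumed to wait for at most one lock.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative cycle check over a lock-holding/acquiring relation. */
final class CircSyncSketch {

    /**
     * threads:      the set of threads under consideration.
     * holderOfLock: maps a lock to the thread currently holding it.
     * awaitedLock:  maps a thread to the lock it is trying to acquire.
     * Returns true when following "waits for the holder of" edges from some
     * thread leads back to a thread already visited on that walk.
     */
    static boolean circSync(Set<String> threads,
                            Map<String, String> holderOfLock,
                            Map<String, String> awaitedLock) {
        for (String start : threads) {
            Set<String> visited = new HashSet<>();
            String current = start;
            while (current != null && threads.contains(current)) {
                if (!visited.add(current)) {
                    return true; // revisited a thread: the relation has a cycle
                }
                String wanted = awaitedLock.get(current);
                String owner = (wanted == null) ? null : holderOfLock.get(wanted);
                if (current.equals(owner)) {
                    break; // re-acquiring an already held (reentrant) lock never blocks
                }
                current = owner;
            }
        }
        return false;
    }
}
```

With two threads that each hold one StringBuffer lock and wait for the other's, the walk returns to its starting thread and the check reports a cycle.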

The region position is updated as follows (update_{ind}Atomic). If t is not in
one of its atomic regions at state s', and if t’s status at state s was other than N,
then the status is updated to N (lines 41-43). If $\alpha$ is a wait transition, then the
annotated atomic region where t is at state s' definitely cannot be atomic (lines

(a) “Corrected” StringBuffer (excerpts)

    public synchronized StringBuffer append(StringBuffer sb) {
        synchronized (sb) {
            if (sb == null) { sb = NULL; }
            int len = sb.length();
            int newcount = count + len;
            if (newcount > value.length)
                expandCapacity(newcount);
            sb.getChars(0, len, value, count);
            count = newcount;
            return this;
        }
    }

(b) Replicated Workers (excerpts)

    public final synchronized Vector take() {
        Vector returnVal = new Vector(theCollection.size());
        for (int i = 0; i < theCollection.size(); i++) {
            returnVal.addElement(theCollection.take());
        }
        return returnVal;
    }

Fig. 6. Java examples illustrating atomicity checking by invisibility vs. Lipton’s 
movers 



44-45). If t’s status was N, then we can infer that t has just entered one of its
atomic regions. Thus, the status is updated to I (lines 46-47). If t’s status was
I and $\alpha$ is not an independent transition at state s, then t’s status is updated
to D (lines 48-49). Lastly, if t’s status was D and $\alpha$ is not an independent
transition, then we can infer that we have seen two non-independent transitions 
inside the atomic region. Thus, the required atomicity pattern has been violated 
(lines 50-51). The checking algorithm for MC-mover is similar and is presented 
in Figure 5. 
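
The update rules just described can be read as a small state machine over the positions N, I, and D. The sketch below is one possible rendering in Java, assuming the rules are tried in the order in which the text lists them; the type and method names are hypothetical, and keeping the position unchanged on a wait transition is an assumption of this example.

```java
/** Illustrative region-position update for the independence-based check. */
final class RegionPositionSketch {

    enum Position { N, I, D }

    enum Verdict { OK, POSSIBLY_NOT_ATOMIC, DEFINITELY_NOT_ATOMIC }

    /** New position of the thread plus what the step revealed about the region. */
    static final class Result {
        final Position position;
        final Verdict verdict;

        Result(Position position, Verdict verdict) {
            this.position = position;
            this.verdict = verdict;
        }
    }

    static Result update(Position before, boolean inAtomicRegionAfter,
                         boolean isWait, boolean isIndependent) {
        if (!inAtomicRegionAfter) {
            return new Result(Position.N, Verdict.OK);               // thread left its region
        }
        if (isWait) {
            return new Result(before, Verdict.DEFINITELY_NOT_ATOMIC); // wait inside a region
        }
        if (before == Position.N) {
            return new Result(Position.I, Verdict.OK);               // thread just entered
        }
        if (before == Position.I && !isIndependent) {
            return new Result(Position.D, Verdict.OK);               // first dependent step
        }
        if (before == Position.D && !isIndependent) {
            return new Result(Position.D, Verdict.POSSIBLY_NOT_ATOMIC); // second dependent step
        }
        return new Result(before, Verdict.OK);                       // independent step
    }
}
```

A second dependent transition is reported only as a possible violation because the classification is conservative, whereas a wait transition is reported as a definite one.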

Note that considering both styles of transition classification with MC-ind and 
MC-mover rather than some hybrid approach allows us to highlight the subtle 
differences between the frameworks. Also, using the approaches as they currently 
exist allows us to appeal to previous correctness results for both frameworks 
instead of proving the correctness of a combined approach. Taking MC-ind as an 
example, if the algorithm confirms that a method m is atomic with respect to 
a given test harness (i.e., environment that closes the system and provides the 
calls into the classes under analysis), then, for that particular test harness, we 
can conclude that m always completes, and previously established correctness 
results for our notion of independence [5] allow us to conclude that there is no 
interference with the statements of the methods. 



3.1 Assessing MC-ind vs. MC-mover 

Recall the StringBuffer example of Figure 1(a) which was not atomic because 
threads could interfere with the variable sb between a sequence of reads. Fig- 
ure 6(a) attempts to solve this problem by locking the sb variable. With this 
modification, Type-System would verify that this method is atomic. However, in 
a manner similar to Figure 1(b), this method can cause a deadlock if append is 
called from two different threads with the receiver object and method parameter 
reversed in the calls. Using a test harness with two threads that call append with 
arguments in opposing orders, both MC-ind and MC-mover would detect that the 
method is non-atomic. 

    public abstract class RWVSN {
        protected Vector waitingWriterMonitors_ = new Vector();

        protected synchronized void notifyWriter() {
            if (waitingWriterMonitors_.size() > 0) {
                Object oldest = waitingWriterMonitors_.firstElement();
                waitingWriterMonitors_.removeElementAt(0);
                synchronized (oldest) { oldest.notify(); }
                ++activeWriters_;
            }
        }
    }

Fig. 7. A readers/writers example (excerpts)



Now consider Figure 6(b), which is taken from the replicated worker framework
presented in [6]. In this method, there is a series of lock acquires/releases
on the vectors referenced by theCollection and returnVal when the size()
and take() methods are called. Thus, using the mover transition classification,
the transitions have the form ..R..L..R..L, which means that neither Type-System
nor MC-mover can confirm that this method is atomic. However, due to the way
that the data structures are set up, the vector referenced by the theCollection
field is always dominated by the receiver object’s lock (and thus protected by
it). In addition, our dynamic escape analysis can detect that the returnVal local
variable refers to an object that is thread-local at every point it is accessed
during the method. Using our independence classification, the transitions (even
the acquire/releases) have the form I*, and therefore MC-ind can verify that the
method is indeed atomic.

Thus, there are some cases where MC-ind can detect atomicity but MC-mover
cannot (in the case above, because the dynamic escape analysis
detects that the lock acquires are not actually needed to obtain non-interference
for that method). Similarly, there are cases when MC-mover can detect atomicity
but MC-ind cannot. This typically occurs when there is a method with nested lock
acquire/release which has a pattern RR..LL in MC-mover but DD..II in MC-ind.
Since both approaches are conservative but safe with respect to the supplied test
harness, if even one of them concludes that a method is atomic, then we
can safely conclude that the method is atomic with respect to the supplied test
harness. We take advantage of this in an alternate implementation that runs
both approaches simultaneously. Specifically, we weave lines 2.1, 2.3, 4.3, 8.1, and
13.2 of Figure 4 and Figure 5 and replace line 3.3 as follows:

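3.3 foreach $\rho$ in (regions flagged as possibly non-atomic by MC-ind) $\cap$ (regions flagged as possibly non-atomic by MC-mover) do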

That is, only if both algorithms agree that a region is possibly not atomic do 
we report that it is possibly not atomic. As will be discussed in the experiment 
section, the combination of the algorithms is precise enough to determine the 
methods in Figure 6 are atomic. 
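
The combination rule amounts to a set intersection: a region is reported only when both classifications flag it. A minimal sketch, assuming each checker exposes its flagged regions as a set of names (the class, method, and region names are hypothetical):

```java
import java.util.Set;
import java.util.TreeSet;

/** Illustrative combination of the two checkers' "possibly not atomic" reports. */
final class CombinedReportSketch {

    /** Regions that both MC-ind and MC-mover flagged as possibly not atomic. */
    static Set<String> possiblyNotAtomic(Set<String> flaggedByInd,
                                         Set<String> flaggedByMover) {
        Set<String> both = new TreeSet<>(flaggedByInd); // copy so the inputs stay untouched
        both.retainAll(flaggedByMover);                  // keep only the regions flagged by both
        return both;
    }
}
```

Because each approach is safe on its own, dropping a region that only one of them flags does not sacrifice soundness with respect to the supplied test harness.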

The approach can be further improved by noting that methods that fail
MC-mover due to patterns like ..acq $l_1$..rel $l_1$..acq $l_2$..rel $l_2$.. (i.e., R..L..R..L) often
are still atomic because $l_1$ and $l_2$ are from objects $o_1$ and $o_2$ that are either
thread-local or already dominated by (and thus protected by) another lock.
Figure 7 presents a readers/writers example^ that illustrates this case (note

^ http://gee.cs.oswego.edu/dl/cpj/classes/RWVSN.java

Table 1. Experiment Data

Example                 Threads  Locations  Trans   States  Time (s)  Result  (I)  (L)  (C)  (W)
Original String-buffer  3        175        383     9       .26       check   9    9    9    9
                                                                      noise   0    0    0    0
                                                                      error   1    1    1    1
Deadlock String-buffer  3        183        99      4       .13       check   9    9    9    9
                                                                      noise   0    0    0    0
                                                                      error   1    1    1    1
Readers Writers         5        314        127337  2504    19        check   23   23   23   26
                                                                      noise   3    3    3    0
                                                                      error   0    0    0    0
Replicated Workers      4        509        230894  4477    63.88     check   38   38   39   39
                                                                      noise   1    1    0    0
                                                                      error   0    0    0    0


the sequence of acquire/release pairs represented by the method calls on
waitingWriterMonitors_ and the synchronization on oldest). However, the
read-only waitingWriterMonitors_ field points to a Vector object that is lock 
dominated (hence protected) by the receiver RWVSN object. 

Based on this observation, we can modify the existing mover classification (in
which an acquire of l is never considered a left mover except when it actually represents
a re-acquire of l) to take into account thread-locality and lock domination
(protection) information accumulated by Bogor:

— In the right mover region of an atomic region, a left mover is now defined 
to be a lock release when the lock being released is non-thread-local or not 
protected by some locks held by the current executing thread. 

— In the left mover region of an atomic region, a right mover is now defined
to be a lock acquire when the lock being acquired is non-thread-local or not 
protected by some locks held by the current executing thread. 

That is, we avoid changing the region position information when encountering
lock acquires/releases that do not actually play a role in protecting an object.
This change allows MC-mover to correctly identify the method of Figure 7 as
atomic since the lock operations on waitingWriterMonitors_ do not change
the region position (i.e., one has the pattern R* up to the point where the
lock on object oldest is released, and the remaining increment can be classified as L).
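
One way to picture the modified classification is as a position update that skips lock operations on thread-local or lock-dominated objects. The sketch below reads the two conditions as "a lock operation is ignored when its lock is thread-local or already protected by a held lock", which is the reading suggested by the sentence that follows the list. It is a simplification: every non-lock transition is treated as leaving the position unchanged, and all names are hypothetical.

```java
/** Illustrative mover-position update using thread-locality and lock domination. */
final class ModifiedMoverSketch {

    enum Kind { ACQUIRE, RELEASE, OTHER }

    enum Position { R, L, VIOLATION }

    /** A lock operation plays no protective role if its lock is thread-local or dominated. */
    static boolean ignorable(boolean threadLocal, boolean protectedByHeldLock) {
        return threadLocal || protectedByHeldLock;
    }

    static Position step(Position position, Kind kind,
                         boolean threadLocal, boolean protectedByHeldLock) {
        if (position == Position.VIOLATION) {
            return Position.VIOLATION;           // a broken pattern stays broken
        }
        if (kind == Kind.OTHER || ignorable(threadLocal, protectedByHeldLock)) {
            return position;                     // no change to the region position
        }
        if (kind == Kind.RELEASE) {
            return Position.L;                   // significant release: left mover
        }
        // Significant acquire: a right mover, allowed only before any left mover.
        return position == Position.R ? Position.R : Position.VIOLATION;
    }
}
```

Replaying the lock operations of Figure 7 with the Vector marked as protected by the receiver's lock keeps the position in R until the lock on oldest is released, so the final increment falls in the L region.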

Note that Type-System cannot establish that this method is atomic because
its rule for synchronized l does not recognize the fact that some other lock may
already be protecting l, nor does it take into account thread-locality information.



4 Experimental Results 

Table 1 presents the results of running the examples that we have described
previously on an Opteron 1.8 GHz (32-bit mode) with a maximum heap of 1
Gb using the Java 2 Platform (we have also run almost all of the examples
from [10] with results similar to what is shown above). In all runs, we used all
reductions available in Bogor described in [5] with the addition of the read-only
reduction [19]. The (I), (L), (C), and (W) columns denote MC-ind, MC-mover, the
combination of MC-ind and MC-mover, and the combination that uses the
modified definition of left/right movers presented at the end of the previous
section. For each example, we give the number of threads and the locations or
control points of the example that are executed. Furthermore, we give the number
of transitions and states, and time (in seconds) needed to run each test case. The 
maximum memory consumption of the examples is 2.69 Mb, which is
for the replicated workers example, and the minimum memory consumption is 
.8 Mb for the String-buffer example. For each example, check is the number of 
atomic methods that are verified, noise is the number of atomic methods that 
cannot be verified due to imprecision of the algorithm, and error is the number 
of specified atomic methods that definitely cannot be atomic. 

For Original String-buffer, we correctly detect the atomicity violation of
append as reported by [10], and for Deadlock String-buffer, we correctly detect
the atomicity violation due to deadlock that is not detected by [10]. Note,
however, that this relied on constructing a test harness that happened to expose
the deadlocking schedule. Note also that the Readers/Writers and Replicated
Workers examples use Java synchronized collection library classes such as
java.util.Vector, which are also indirectly verified for atomicity via calls
from the larger examples. In addition, we verify the notifyReaders,
notifyWriter, afterRead, and afterWrite methods of the Readers Writers example
as atomic as well as various methods for synchronization in the Replicated
Workers.

The timing numbers indicate the model-checking approach is feasible for unit testing.
Fewer annotations are required: we do not require the pre-condition
annotations of Type-System that indicate the locks that are held when a method 
is called, and although we do require annotations stating that fields are lock 
protected, our approach infers the protecting objects instead of having them 
stated directly. Flanagan and Qadeer note that they run all their experiments 
with an unchecked annotation that allows their checker to assume that an object 
does not escape its constructor. Such a property is automatically checked in 
Bogor’s dynamic escape analysis. 

We have already noted that to apply our model-checking approach, the user 
must construct a test harness (environment) that generates appropriate coverage 
of the system’s execution paths. It is possible that the model-checking approach 
can fail to detect errors because the behavior of the environment is not suffi- 
ciently rich. Also note that the model-checking approaches will likely check a 
method m many times during the course of verification whereas the type system 
only needs to make one pass. Of course the benefit is an increase in the precision 
due to customization of reasoning to the invocation contexts. 

5 Related Work 

We have already made extensive comparisons with Flanagan and Qadeer’s type 
system [10], which inspired the work presented in this paper. Their type system 
is based on the earlier type system for detecting race conditions [7]. Flanagan 
and Freund developed Atomizer [8], a runtime verification tool to check atom- 
icity specifications that uses a variant of the Eraser algorithm to detect the 
lock-sets that protect shared variables and thread-local variables similar to [5].
However, in their algorithm, shared variables cannot become unshared later on, 
and they also failed to enforce Lipton’s R1 condition. Wang and Stoller [21] also 
developed a similar runtime tool for checking atomicity. One notable feature of
their tool is that given a trace, the tool permutes the ordering of events to find 
atomicity violations. Both of these run-time checking approaches scale better 
than our model-checking approach because they do not store states and they do 
not consider every possible interleaving. Of course, this means that they are also 
more likely to fail to detect atomicity violations since many feasible paths are 
not examined. 

Our previous work on partial order reductions [5] combined escape analysis 
with lock-based reduction strategies formalized earlier by Stoller [19] and im- 
plemented in JPF [2]. Stoller and Cohen have recently developed a theory of 
reductions using abstract algebra [20], and their framework addresses many of 
the issues related to independence and commutativity discussed here. 

Other interesting efforts on software model-checking such as [1,11] have not 
considered many of the issues addressed here, since those projects have focused 
on automated abstraction of sequential C programs and have not included con- 
currency nor dynamically created objects. 

Finally, model-checkers such as Spin [13] include the keyword atomic. How- 
ever, this represents a directive to the model-checker to group transitions to- 
gether without any interleaving rather than a specification to be verified against 
an implementation in which a developer has tried to achieve non-interference 
using synchronization mechanisms. 



6 Conclusion 

Flanagan and Qadeer have argued convincingly that atomicity specifications and 
associated verification mechanisms are useful in the context of concurrent pro- 
gramming. We believe that our work demonstrates that model-checking using 
state-of-the-art software model-checkers like Bogor provides an effective means 
of checking atomicity specifications. The key enabling factors are (1) Bogor’s 
partial order reduction strategies based on dynamically accumulated locking 
and object escape information that is more difficult to obtain in static type systems,
and (2) the ability of model checking to easily enforce
the “must complete” requirement from Lipton’s theory, which was not enforced
in the type system of Flanagan and Qadeer (and indeed, would be difficult to 
enforce in any type system without resorting to very conservative approxima- 
tions). An extended version of this paper and more information about examples 
and experimental results can be found on the Bogor web site [17]. 

1 Introduction

This paper describes experiences with software model checking after several years 
of using static analysis to find errors. We initially thought that the trade-off 
between the two was clear: static analysis was easy but would mainly find shallow 
bugs, while model checking would require more work but would be strictly better 
— it would find more errors, the errors would be deeper, and the approach would 
be more powerful. These expectations were often wrong. 

This paper documents some of the lessons learned over the course of using 
software model checking for three years and three projects. The first two projects 
used both static analysis and model checking, while the third used only model 
checking, but sharply reinforced the trade-offs we had previously observed.

The first project, described in Section 3, checked FLASH cache coherence 
protocol implementation code [1]. We first used static analysis to find violations 
of FLASH-specific rules (e.g., that messages are sent in such a way as to prevent 
deadlock) [2] and then, in follow-on work, applied model checking [3]. A startling 
result (for us) was that despite model checking’s power, it found far fewer errors 
than relatively shallow static analysis: eight bugs versus 34. 

The second project, described in Section 4, checked three AODV network 
protocol [4] implementations. Here we first checked them with CMC [5], a model 
checker that directly checks C implementations. We then statically analyzed 
them. Model checking worked well, finding 42 errors (roughly 1 per 300 lines of 
code), about half of which involve protocol properties difficult to check statically. 
However, in the class of properties both methods could handle, static analysis 
found more errors than model checking. Also, it took much less effort: a couple 
of hours, while our model checking effort took approximately three weeks. 

The final project, described in Section 5, used CMC on the Linux TCP net- 
work stack implementation. The most startling result here was just how difficult 
it is to model check real code that was not designed for it. It turned out to be 
easier to run the entire Linux Kernel along with the TCP implementation in 
CMC rather than cut TCP out of Linux and make a working test harness. We 
found 4 bugs in the Linux TCP implementation. 

The main goal of this paper is to compare the merits of the two approaches 
for finding bugs in system software. In the properties that could be checked 
by both methods, static analysis is clearly more successful: it took less time
to do the analysis and found more errors. The static analysis simply requires 
that the code be compiled, while model checking a system requires a carefully 
crafted environment model. Also, static analysis can cover all paths in the code 
in a straightforward manner. On the other hand, a model checker executes only 
those paths that are explicitly triggered by the environment model. A common 
misconception is that model checking does not suffer from false errors, while 
these errors typically inundate a static analysis result. In our experience, we 
found this not to be true. False execution paths in the model checker can be 
triggered by erroneous environments, leading to false errors. These errors can be 
difficult to trace and debug. Meanwhile, false errors in static analysis typically 
arise out of infeasible paths, which can be eliminated by simple analysis or even 
unsubstantial manual inspection. 

The advantage of model checking is in its ability to check for a richer set of 
properties. Properties that require reasoning about the system execution are not 
amenable to static checking. Many protocol specific properties such as routing 
loops and protocol deadlocks fall in this category. A model checker excels in 
exploring intricate behaviors of the system and finding errors in corner cases 
that have not been accounted for by the designers and the implementors of the 
system. However, the importance of checking these properties should significantly
outweigh the additional effort required to model check a system.

While this paper describes drawbacks of software model checking compared 
to static analysis, it should not be taken as a jeremiad against the approach. We 
are very much in the “model checking camp” and intend to continue research 
in the area. One of the goals of this paper is to recount what surprised us 
when applying model checking to large real code bases. While more seasoned 
minds might not have made the same misjudgments, our discussions with other 
researchers have shown that our naivete was not entirely unreasonable. 



2 The Methodologies 

This paper is a set of case studies, rather than a broad study of static analysis 
and model checking. While this limits the universality of our conclusions, we 
believe the general trends we observe will hold, though the actual coefficients 
observed in practice will differ. 



2.1 The Model Checking Approach 

All of our case studies use traditional explicit state model checkers [6, 7]. We do 
no innovation in terms of the actual model checking engine, and so the challenges 
we face should roughly mirror those faced by others. We do believe our conclusions
optimistically estimate the effort needed to model check code. A major
drawback of most current model checking approaches is the need to manually 
write a specification of the checked system. Both of our approaches dispense with 
this step. The first automatically extracts a slice of functionality that is trans-
lated to the model checking language, similar to the automatic extraction work 
done by prior work, notably Bandera [8] and Feaver [9]. Our second approach 
eliminates extraction entirely by directly model checking the implementation 
code. It is similar to Verisoft [10], which executes C programs and has been successfully
used to check communication protocols [11], and to Java PathFinder [12],
which uses a modified Java virtual machine that can check concurrent Java pro- 
grams. 



2.2 The Static Analysis Approach 

The general area of using static analysis for bug finding has become extremely 
active. Some of the more well-known static tools include PREfix [13],
ESP [14], ESC [15], MOPS [16] and SLAM [17], which combines aspects of both 
static analysis and model checking. 

The static tool approach discussed in this paper is based on compiler ex- 
tensions ( “checkers” ) that are dynamically linked into the compiler and applied 
flow-sensitively down a control-flow graph representation of source code [18]. 
Extensions can perform either intra- or inter-procedural analysis at the discre- 
tion of the checker writer. In practice, this approach has been effective, finding 
hundreds to thousands of errors in Linux, BSD, and various commercial systems. 

While we make claims about “static analysis” in general, this paper focuses 
on our own static analysis approach (“metacompilation” or MC), since it is the 
one with which we have personal experience. The approach has several idiosyncratic features
compared to other static approaches that should be kept in mind. In particular, 
these features generally reduce the work needed to find bugs as compared to 
other static analysis techniques. 

First, our approach is unsound. Code with errors can pass silently through 
a checker. Our goal has been to find the maximum number of bugs with the 
minimum number of false positives. In particular, when checkers cannot deter- 
mine a needed fact they do not emit a warning. In contrast, a sound approach 
must conservatively emit error reports whenever it cannot prove an error cannot 
occur. Thus, unsoundness lets us effectively check properties that, if checked soundly,
would overwhelm the user with false positives.

Second, we use relatively shallow analysis as compared to a simulation-based 
approach such as in PREfix [13]. ^ Except for a mild amount of path-sensitive 
analysis to prune infeasible paths [19], we do not: model the heap, track most 
variable values, or do sophisticated alias analysis. A heavier reliance on simula- 
tion would increase the work of using the tool, since these often require having 
to build accurate, working models of the environment and of missing code. In 
a sense simulation pushes static analysis closer to model checking, and hence 
shares some of its weaknesses as well as strengths. 

^ Our approach has found errors in code checked by PREfix for the same properties, 
so the depth of checking is not entirely one-sided. 
Third, our approach tries to avoid the need for annotations, in part by using
statistical analysis to infer checkable properties [20]. The need for annotations 
would dramatically increase the effort necessary to use the tool. 

3 Case Study: FLASH 

This section describes our experience checking FLASH cache coherence protocol 
code using both static analysis and model checking [1]. The FLASH multiprocessor
implements cache coherence in software. While this gives flexibility, it places
a serious burden on the programmer. The code runs on each cache miss, so it 
must be egregiously optimized. At the same time a single bug in the controller 
can deadlock or livelock the entire machine. 

We checked five FLASH protocols with static analysis and four with model 
checking. Protocols ranged from 10K to 18K lines and have long control flow
paths. The average path was 73 to 183 lines of code, with a maximum of roughly 
400 lines. Intra-procedural paths that span 10-20 conditionals are not uncom- 
mon. For our purposes, this code is representative of the low-level code used on 
a variety of embedded systems: highly optimized, difficult to read, and difficult 
to get correct. For the purpose of finding errors, FLASH was a hard test: by 
the time we checked it, it had already undergone over five years of testing under
simulation, on a real machine, and one protocol had even been model checked 
using a manually constructed model [21]. 



3.1 Checking FLASH with Static Analysis 

While FLASH code was difficult to reason about, it had the nice property that 
many of the rules it had to obey mapped clearly to source code and thus 
were readily checked with static analysis. The following rule is a representative
example. In the FLASH code, incoming message buffers are read using the
macro MISCBUS_READ_DB. All reads must be preceded by a call to the macro
WAIT_FOR_DB_FULL to synchronize the buffer contents. To increase parallelism,
WAIT_FOR_DB_FULL is only called along paths that require access to the buffer
contents, and it is called as late as possible along these paths. This rule can
be checked statically by traversing all program paths until we either (1) hit a
call to WAIT_FOR_DB_FULL (at which point we stop following that path) or (2)
hit a call to MISCBUS_READ_DB (at which point we emit an error). In general the
static checkers roughly follow a similar pattern: they match on specific source
constructs and use an extensible state machine framework to ensure that the
matched constructs occur (or do not occur) in specific orders or contexts. 
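The path traversal described above can be sketched in a few lines of C. This is
only an illustration, not the checker we actually used: the toy control-flow
graph, the event names, and the function names are all assumptions made for the
example.

```c
#include <stddef.h>

/* Hypothetical event kinds a checker might match in a basic block. */
enum event { EV_NONE, EV_WAIT, EV_READ };

/* Toy control-flow graph node: one event, up to two successors (-1 = none). */
struct node {
    enum event ev;
    int succ[2];
};

/*
 * Depth-first walk over all paths from node n.
 * Returns the number of read nodes reachable without a preceding wait.
 * visited[] stops a node from being re-explored in the "not yet waited"
 * state, which also terminates the walk on loops.
 */
static int check_from(const struct node *cfg, int n, unsigned char *visited)
{
    int errors = 0;

    if (n < 0 || visited[n])
        return 0;
    visited[n] = 1;
    if (cfg[n].ev == EV_WAIT)
        return 0;               /* (1) wait seen: stop following this path */
    if (cfg[n].ev == EV_READ)
        return 1;               /* (2) read with no wait: report an error */
    for (int i = 0; i < 2; i++)
        errors += check_from(cfg, cfg[n].succ[i], visited);
    return errors;
}

/* Check a function whose CFG has `count` nodes, with the entry at index 0. */
int check_wait_before_read(const struct node *cfg, size_t count)
{
    unsigned char visited[64] = {0};

    if (count == 0 || count > sizeof visited)
        return -1;              /* the sketch only handles small graphs */
    return check_from(cfg, 0, visited);
}
```

A branch that reaches the read through a wait on both arms yields zero errors;
removing the wait from one arm yields one.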

Table 1 gives a representative listing of the FLASH rules we checked. Since 
the primary job of a FLASH node is to receive and respond to cache requests, 
most rules involve correct message handling. The most common errors were not 
deallocating message buffers (9 errors) and mis-specifying the length of a message 
(18 errors). The other rules were not easier, but generally had fewer locations where
they had to be obeyed. There were 33 errors in total and 28 false positives. We
obtained these numbers three years ago. Using our current system would have
reduced the false positive rate, since most were due to simple infeasible paths
that it can eliminate. (The severity of the errors made the given rate perfectly
acceptable.)

Table 1. Representative FLASH rules, the number of lines of code for an MC rule
checker (LOC), the number of bugs the checker found (Bugs), as well as the number
of false positives (FP). We have elided other less useful checkers; in total, they
found one more bug at a cost of about 30 false positives.

Rule                                                          LOC  Bugs  FP
------------------------------------------------------------  ---  ----  --
WAIT_FOR_DB_FULL must come before MISCBUS_READ_DB              12     4   1
The has_data parameter for message sends must match the        29    18   2
  specified message length (be one of LEN_NODATA, LEN_WORD,
  or LEN_CACHELINE).
Message buffers must be: allocated before use, deallocated     94     9  25
  after, and not used after deallocation.
Message handlers can only send on pre-specified “lanes.”      220     2   0
Total                                                         355    33  28

3.2 Model Checking FLASH 

While static analysis worked well on code-visible rules, it had difficulty with
properties that were not visible in the source code, but rather implied by it,
such as invariants over data structures or values produced by code operations.
Examples include the requirement that the sharing list for a dirty cache line be
empty, or that the count of sharing nodes equal the number of caches a line is in.
On the other hand, these sorts of properties, and FLASH's structure in general,
are well suited to model checking.
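To make the two example invariants concrete, the sketch below states them as
predicates over a directory entry. The structure layout and all names here are
hypothetical and chosen for illustration; they are not FLASH's actual data
structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical directory entry for one cache line (not FLASH's real layout). */
struct dir_entry {
    bool     dirty;        /* line is held exclusively and modified */
    uint32_t sharers;      /* bit i set => node i caches the line */
    unsigned sharer_count; /* count the protocol maintains incrementally */
};

/* Count the set bits in the sharer mask. */
static unsigned popcount32(uint32_t x)
{
    unsigned n = 0;

    while (x) {
        x &= x - 1;        /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Invariant 1: a dirty line has an empty sharing list. */
bool inv_dirty_has_no_sharers(const struct dir_entry *e)
{
    return !e->dirty || e->sharers == 0;
}

/* Invariant 2: the maintained count equals the number of caching nodes. */
bool inv_count_matches(const struct dir_entry *e)
{
    return e->sharer_count == popcount32(e->sharers);
}
```

Neither predicate corresponds to a single source construct: whether it holds
depends on the values the protocol code has produced so far, which is why a
model checker that evaluates it in every reached state is a natural fit.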

Unfortunately, the known hard problem with using model checking on real
code is the need to write a specification (a “model”) that describes what the
software does. For example, it took a graduate student several months to build
a hand-written, heavily simplified model of a single FLASH protocol [21]. Our
model checking approach finessed this problem by using static analysis to
automatically extract models from source code. We started the project after noticing
the close correspondence between a hand-written specification of FLASH [21]
and the implementation code itself. FLASH code made heavy use of stylized
macros and naming conventions. These “latent specifications” [20] made it
relatively easy to pick out the code relevant to various important operations
(message sends, interactions with the I/O subsystem, etc.) and then automatically
translate them to a checkable model.

Model checking with our system involves the following four steps. First, the 
user provides a metal extension that when run by our extensible compiler marks 
specific source constructs, such as all message buffer manipulations or sends. 
These extensions are essentially abstraction functions. Second, the system then 
automatically extracts a backward slice of the marked code, as well as its
dependencies. Third, the system translates the sliced code to a Murφ model. Fourth,
the Murφ model checker checks the generated model along with a hand-written
environment model.

Table 2 lists a representative subset of the rules we checked that static anal- 
ysis would have had difficulty with. Surprisingly, there were relatively few errors 
in these properties as compared to the more shallow properties checked with 
static analysis. 



Table 2. Description of a representative subset of invariants checked in four FLASH
protocols using model checking. Checking these with static analysis would be difficult.

Invariants

The RealPtrs counter does not overflow (RealPtrs maintains the number of sharers).
Only a single master copy of each cache line exists (basic coherence).
A node can never put itself on the sharing list (sharing list is only for remote nodes).
No outstanding requests on cache lines that are already in Exclusive state.
Nodes do not send network messages to themselves.
Nodes never overflow their network queues.
Nodes never overflow their software queues (queue used to suspend handlers).
The protocol never tries to invalidate an exclusive line.
Protocol can only put data into the processor’s cache in response to a request.



3.3 Myth: Model Checking Will Find More Bugs 

The general perception within the bug-finding community is that since model 
checking is “deeper” than static analysis then if you take the time to model check 
code, you will find more errors. We have not found this to be true, either in this 
case study or in the next one. For FLASH, static analysis found roughly four 
times as many bugs as model checking, despite the fact that we spent more time 
on the model checking effort. Further, this differential was after we aggressively 
tried to increase bug counts. We were highly motivated to do so since we had 
already published a paper that found 34 bugs [2]; publishing a follow-on paper 
for a technique that required more work and found fewer was worrisome. In the 
end, only two of the eight bugs found with model checking had been missed by
static analysis. Both were counter overflows that were deeper in the sense that
they required a deep execution trace to find. While they could potentially have
been found with static analysis, doing so would have required a special-case
checker.

The main underlying reason for the lower bug counts is simple: model checking
requires running code, whereas static analysis only requires that you compile it.
Model checking requires a working model of the environment. Environments are often
messy and hard to specify, so the formal model will simplify them. There were five
main simplifications that caused the model checker to miss FLASH bugs found
with static analysis:

1. We did not model cache line data, though we did model the state that cache 
lines were in, and the actual messages that were sent. This omission both 
simplified the model and shrank the state space. The main implication in 
terms of finding errors was that there was nothing in the model to ensure that 
the data buffers used to send and receive cache lines were allocated, deleted 
or synchronized correctly. As a result, model checking missed 13 errors: all 
nine buffer allocation errors and all four buffer race conditions. 

2. We did not model the FLASH I/O subsystem, primarily because it was so 
intricate. This caused the model checker to miss some of the message-length 
errors found by the static checker. 

3. We did not model uncached reads or writes. The node controllers support 
reads and writes that explicitly bypass the cache, going directly to mem- 
ory. These were used by rare paths in the operating system. Because these 
paths were rare it appears that testing left a relatively larger number of er- 
rors on them as compared to more common paths. These errors were found 
with static analysis but missed by the model checker because of this model 
simplification. 

4. We did not model message “lanes.” To prevent deadlock, the real FLASH 
machine divides the network into a number of virtual networks (“lanes”). 
Each different message type has an associated lane it should use. For sim- 
plicity, our model assumed no such restrictions. As a result, we missed the 
two deadlock errors found with static analysis. 

5. FLASH code has many dual code paths — one used to support simulation, 
the other used when running on the actual FLASH hardware. Errors in the 
simulation code were not detected since we only checked code that would 
actually run on the hardware. 

Taking a broader view, the main source of false negatives is not incomplete 
models, but the need to create a model at all. This cost must be paid for each new 
checked system and, given finite resources, it can preclude checking new code or 
limit checking to just code or properties whose environment can be specified with 
a minimum of fuss. A good example for FLASH is that time limitations caused 
us to skip checking the “sci” protocol, thereby missing five buffer management 
errors (three serious, two minor) found with static analysis. 



3.4 Summary 

Static analysis works well at checking properties that clearly map to source code 
constructs. Model checking can similarly leverage this feature to automatically 
extract models from source code. 

As this case study shows, many abstract rules can be checked by small, simple 
static checkers. Further, the approach was effective enough to find errors in code 
that was (1) not written for verification and (2) had been heavily-tested for over 
five years. 
After using both approaches, static analysis had two important advantages
over model checking. First, in sharp contrast to the thorough, working code
understanding demanded by model checking, static analysis allowed us to
understand little of FLASH before we could check it, mainly how to compile it and
a few sentences describing rules. Second, static analysis checks all paths in all
code that you can compile. Model checking only checks code you can run, and
of this code, only the paths you execute. This fact hurt its bug counts in the next
case study as well.

4 Case Study: AODV 

This section describes our experiences finding bugs in the AODV routing protocol 
implementation using both model checking and static analysis. We first describe 
CMC, the custom model checker we built, give an overview of AODV, and then 
compare the bugs found (and not found) by both approaches. 

4.1 CMC Overview 

While automatically slicing out a model for FLASH was far superior to hand
constructing one, the approach had two problems. First, it required that the user
have an intimate knowledge of the system, so that they could effectively select and
automatically mark stand-alone subparts of it. Second, Murφ, like most modeling
languages, lacks many C constructs such as pointers, dynamic allocation, and bit
operations. These omissions make general translation difficult.

We countered these problems by building CMC, a model checker that checks 
programs written in C [5]. CMC was motivated by the observation that there 
is no fundamental reason model checkers must use a weak input language. As 
it executes the implementation code directly, it removes the need to provide an 
abstract model, tremendously reducing the effort required to model check a sys- 
tem. As the implementation captures all the behaviors of the system, CMC is no 
longer restricted to behaviors that can be represented in conventional modeling 
languages. CMC is an explicit state model checker that works more or less like
Murφ, though it lacks many of Murφ’s more advanced optimizations. As CMC
checks a full implementation rather than an abstraction of it, it must handle
much larger states and state spaces. It counters the state explosion problem by
using aggressive approximate reduction techniques such as hash compaction [22]
and various heuristics [5] to slice unnecessary detail from the state.
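The idea behind explicit-state search with hash compaction can be sketched as
follows. This is a toy illustration and not CMC's implementation: the two-counter
system, the FNV-1a signature, and the table sizes are all assumptions made for
the example. The visited set stores only a 32-bit signature per state rather
than the state itself.

```c
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 4096u   /* slots in the signature table (power of two) */
#define MAX_STACK  4096

/* Toy system state: two counters, each in 0..9 (illustrative only). */
struct state { uint8_t a, b; };

/* FNV-1a over the state bytes; the hash stands in for the full state. */
static uint32_t signature(const struct state *s)
{
    const uint8_t bytes[2] = { s->a, s->b };
    uint32_t h = 2166136261u;

    for (int i = 0; i < 2; i++) {
        h ^= bytes[i];
        h *= 16777619u;
    }
    return h ? h : 1;      /* reserve 0 to mean "empty slot" */
}

/* Insert a signature; returns 1 if it was new, 0 if already present. */
static int seen_insert(uint32_t *table, uint32_t sig)
{
    uint32_t i = sig & (TABLE_SIZE - 1);

    while (table[i] != 0) {
        if (table[i] == sig)
            return 0;      /* same signature: treated as the same state */
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    table[i] = sig;
    return 1;
}

/* Explore the states reachable from (0,0); returns how many were visited. */
int explore(void)
{
    static uint32_t table[TABLE_SIZE];
    static struct state stack[MAX_STACK];
    int top = 0, visited = 0;
    struct state init = { 0, 0 };

    memset(table, 0, sizeof table);
    seen_insert(table, signature(&init));
    stack[top++] = init;

    while (top > 0) {
        struct state s = stack[--top];
        /* Two transitions: bump a or bump b, each modulo 10. */
        struct state next[2] = {
            { (uint8_t)((s.a + 1) % 10), s.b },
            { s.a, (uint8_t)((s.b + 1) % 10) },
        };

        visited++;
        for (int i = 0; i < 2; i++) {
            if (seen_insert(table, signature(&next[i])) && top < MAX_STACK)
                stack[top++] = next[i];
        }
    }
    return visited;
}
```

The toy system has 100 reachable states, so the search visits at most 100. It
visits fewer if two distinct states happen to share a signature, which is the
sense in which the reduction is approximate: memory per state shrinks, at the
cost of possibly skipping a state.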



4.2 AODV Overview 

The AODV (Ad-hoc On-demand Distance Vector) protocol [4] is a loop-free routing
protocol for ad-hoc networks. It is designed to handle mobile nodes and a “best
effort” network that can lose, duplicate, and corrupt packets.

AODV guarantees that the network is always free of routing loops. If an
error in the specification or implementation causes a routing loop to appear in
the network, the protocol has no mechanism to detect or recover from it,
allowing the loop to persist forever and completely breaking the protocol. Thus, it is
crucial to test both the AODV protocol specification and any
AODV implementation for loop freeness as thoroughly as possible.

Table 3. Properties checked in AODV.

Assertion Type   Examples
---------------  ----------------------------------------------------------------
Generic          Segmentation violations, memory leaks, dangling pointers.
Routing Loop     The routing tables of all nodes do not form a routing loop.
Routing Table    At most one routing table entry per destination.
                 No route to self in the AODV-UU implementation.
                 The hop count of the route to self is 0, if present.
                 The hop count is either infinity or less than the number of
                 nodes in the network.
Message Field    All reserved fields are set to 0.
                 The hop count in the packet cannot be infinity.

Table 4. Lines of implementation code vs. CMC modeling code.

Protocol      Checked  Correctness     Environment        State
              Code     Specification   Network   Stubs    Canonicalization
-----------   -------  -------------   -------   -----    ----------------
mad-hoc          3336            301       400     100                 165
Kernel AODV      4508            301       400     266                 179
AODV-UU          5286            332       400     128                 185

AODV is relatively easy to model check. Its environment model is greatly
simplified by the fact that the only inputs it deals with are user requests for a
route to a destination. These can be easily modeled as a nondeterministic input
that is enabled in all states. Apart from this, an AODV node responds to two
events: a timer interrupt and a packet received from other AODV nodes in the
network. Both are straightforward to model.



4.3 Model Checking AODV with CMC 

We used CMC to check three publicly available AODV implementations: mad-hoc
(Version 1.0) [23], Kernel AODV (Version 1.5) [24], and AODV-UU (Version 0.5)
[25]. While it is not clear how well these implementations are tested,
they have been used in different testbeds and network simulation environments [26].
On average, the implementations contain 6000 lines of code.

For each implementation, the model consists of a core set of unmodified files. 
This model executes along with an environment which consists of a network 
model and simplified implementations (or “stubs”) for the implementation func- 
tions not included in the model. Table 4 describes the model and environment 
for these implementations. All three models reuse the same network model. 
Table 5. Number of bugs of each type in the three implementations of AODV. The
figures in parentheses show the number of bugs that are instances of the same bug in
the mad-hoc implementation.

                              mad-hoc  Kernel AODV  AODV-UU
----------------------------  -------  -----------  -------
Mishandling malloc failures         4            6        2
Memory Leaks                        5            3        0
Use after free                      1            1        0
Invalid Routing Table Entry         0            0        1
Unexpected Message                  2            0        0
Generating Invalid Packets          3        2 (2)        2
Program Assertion Failures          1        1 (1)        1
Routing Loops                       2        3 (2)    2 (1)
Total                              18       16 (5)    8 (1)
LOC per bug                       185          285      661



Table 6. Comparing static analysis (MC) and CMC. Note that for the MC results
we only ran a set of generic memory and pointer checkers rather than writing
AODV-specific checkers. Generating the MC results took about two hours, rather than the
weeks required for AODV.

                                                     Bugs Found
                                          CMC & MC  CMC alone  MC alone
----------   ---------------------------  --------  ---------  --------
Generic      Mishandling malloc failures        11          1         8
Properties   Memory Leaks                        8          -         5
             Use after free                      2          -         -
Protocol     Invalid Routing Table Entry         -          1         -
Specific     Unexpected Message                  -          2         -
             Generating Invalid Packets          -          7         -
             Program Assertion Failures          -          3         -
             Routing Loops                       -          7         -
Total                                           21         21        13



As CMC was being developed during this case study, it is difficult to gauge
the time spent in building these models as opposed to building the model checker
itself. As a rough estimate, it took us two weeks to build the first, mad-hoc
model. Building subsequent models was easier, and it took us one more week to
build the other two.

Table 3 describes the assertions CMC checked in the AODV implementations. 
CMC automatically checks certain generic assertions such as segmentation vio- 
lations. Additionally, the protocol model checks that routing tables are loop free 
at all instants and that each generated message and route inserted into the table 
obey various assertions. Table 4 gives the lines of code required to add these 
correctness properties. 
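A loop-freeness assertion of this kind can be sketched as a check over a
snapshot of every node's routing table. The fixed node count, the table layout,
and the function name below are assumptions made for the illustration and are
not taken from our correctness specification.

```c
#include <stdbool.h>

#define NODES    4
#define NO_ROUTE (-1)

/*
 * next_hop[n][d] is node n's next hop toward destination d, or NO_ROUTE.
 * This is an illustrative global snapshot; a real checker would read each
 * node's routing table out of the model checker's current state.
 */
bool routing_is_loop_free(int next_hop[NODES][NODES])
{
    for (int dest = 0; dest < NODES; dest++) {
        for (int start = 0; start < NODES; start++) {
            int cur = start;
            int hops = 0;

            /* Follow next hops until we arrive, run out of route, or loop. */
            while (cur != dest && cur != NO_ROUTE) {
                if (++hops > NODES)
                    return false;   /* more hops than nodes: a cycle exists */
                cur = next_hop[cur][dest];
            }
        }
    }
    return true;
}
```

Because the check needs the tables of all nodes at one instant, it is naturally
evaluated by the model checker after every transition rather than by inspecting
any single node's code.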

CMC found a total of 42 errors. Of these, 35 are unique errors in the
implementations and one is an error in the underlying AODV specification. Table 5
summarizes the set of bugs found. The Kernel AODV implementation has 5
bugs (shown in parentheses in the table) that are instances of the same bug in
mad-hoc. The AODV specification bug causes a routing loop in all three
implementations.

4.4 Static AODV Checking: More Paths + More Code = More Bugs

We also did a cursory check of the AODV implementations using a set of static 
analysis checkers that looked for generic errors such as memory leaks and invalid 
pointer accesses. The entire process of checking the three implementations and 
analyzing the output for errors took two hours. Static analysis found a total of 
34 bugs. 

Table 6 compares the bugs found by static analysis and CMC. It classifies the 
bugs found into two broad classes depending on the properties violated: generic 
and protocol specific. For generic errors, our results matched those in the FLASH 
case study: static analysis found many more bugs than model checking. Except 
for one, static analysis found all the bugs that CMC could find. As in our previous 
case study (§3.3), the fundamental reason for this difference is that static analysis 
can check all paths in all code that you can compile. In contrast, model checking 
can only check code triggered by the specific environment model. Of the 13 errors 
not found by CMC, 6 are in parts of the code that are either not included in 
the model or cut out during environment modeling. For instance, static analysis 
found two cases of mishandled malloc failures in multicast routing code. All our 
CMC models omitted this code. 

Additionally, CMC missed errors because of subtle mistakes in its environment
model. For example, the mad-hoc implementation uses the send_datagram
function to send a packet and has a memory leak when the function fails.
However, our environment erroneously modeled send_datagram as always
succeeding. CMC thus missed this memory leak. Such environmental errors caused
CMC to miss 6 errors in total. Static analysis found one more error in dead
code that can never be executed by any CMC model. ^
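The effect of such an environment mistake can be sketched as follows. The code
is a hypothetical stand-in for the pattern described above, not the mad-hoc
source or our actual environment model: a sender leaks its buffer when the send
fails, and the leak is only observable if the stub is allowed to fail.

```c
#include <stddef.h>

static unsigned char pool[64];   /* backing store for the one model buffer */
static int live_allocs;          /* buffers allocated but not yet freed */
static int stub_send_fails;      /* outcome the environment picks this run */

/* Counting allocator, so the leak is observed without really leaking. */
static void *model_alloc(void) { live_allocs++; return pool; }
static void model_free(void *p) { (void)p; live_allocs--; }

/* Environment stub: succeeds or fails depending on the chosen outcome. */
static int send_datagram_stub(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return stub_send_fails ? -1 : 0;
}

/* Code under test (hypothetical): forgets to free the buffer on failure. */
static int send_packet(void)
{
    void *buf = model_alloc();

    if (send_datagram_stub(buf, sizeof pool) < 0)
        return -1;               /* BUG: buf leaks on this path */
    model_free(buf);
    return 0;
}

/*
 * Run the code once per environment outcome and count the leaking runs.
 * explore_failures == 0 mimics a stub that is modeled as always succeeding.
 */
int count_leaky_runs(int explore_failures)
{
    int leaky = 0;

    for (int fail = 0; fail <= (explore_failures ? 1 : 0); fail++) {
        live_allocs = 0;
        stub_send_fails = fail;
        (void)send_packet();
        if (live_allocs != 0)
            leaky++;
    }
    return leaky;
}
```

With failures excluded from the environment, no run leaks and the bug goes
unseen; once the failing outcome is explored, the leak shows up immediately.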



4.5 Where Model Checking Won: More Checks = More Bugs

In the class of protocol-specific errors, CMC found 21 errors while static analysis 
found none. While this was partly because we did not check protocol-specific 
properties, many of the errors would be difficult to find statically. We categorize 
the errors that were found by model checking but missed by static analysis into
three classes and describe them below. 

By executing code, a model checker can check for properties that are not
easily visible to static inspection (and thus static analysis). Many
protocol-specific properties fall in this class. Properties such as deadlocks and routing loops
involve invariants of objects across multiple processes. Detecting such loops
statically would require reasoning about the entire execution of the protocol, a
difficult task. Also, present static analyzers have difficulty analyzing properties
of heap objects. The error in Figure 1 is a good example. This error requires
reasoning about the length of a linked list, similar to many heap invariants
that static analyzers have difficulty with. Here, the code attempts to allocate
rerrhdr_msg.dst_cnt temporary message buffers. It correctly checks for malloc
failure and breaks out of the loop. However, it then calls rec_rerr, which
contains a loop that assumes that rerrhdr_msg.dst_cnt list entries were indeed
allocated. Since the list has fewer entries than expected, the code will attempt
to use a null pointer and get a segmentation fault.

^ Static analysis also found a null pointer violation in one of our environment models!
We do not count this error.

// aodv_deamon.c: aodv_recv_message
for (rerri = 0; rerri < rerrhdr_msg.dst_cnt; rerri++) {
    // Step 1: break with < dst_cnt elements in rerrhdr_msg list.
    if (!(tp = malloc(...)))
        break;
    tp->next = rerrhdr_msg.unr_dst;  // enqueue onto list.
    rerrhdr_msg.unr_dst = tp;
}
// Step 2: rec_rerr assumes dst_cnt elements in rerrhdr_msg
rec_rerr(&info_msg, &rerrhdr_msg);

int rec_rerr(struct info *tmp_info, struct rerrhdr *rh) {
    // Step 3: iterates rh->dst_cnt times even if not that many elements
    for (i = 0, tp = rh->unr_dst; i < rh->dst_cnt; i++, tp = tp->next) {
        // ERROR: tp can be null!
        tmp_rtentry = getentry(tp->unr_dst_ip);

Fig. 1. The one memory error missed by static analysis: requires reasoning about
values, and is a good example of where model checking can beat static analysis.
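One way to remove the error in Fig. 1 is to stop iterating when the list ends,
rather than trusting dst_cnt. The sketch below is our illustration of that idea,
not the fix adopted by the mad-hoc maintainers; the trimmed structures and the
function name are hypothetical stand-ins for the ones in the excerpt.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the structures in the excerpt above. */
struct unr_dst {
    unsigned        unr_dst_ip;
    struct unr_dst *next;
};

struct rerrhdr {
    unsigned        dst_cnt;   /* number of entries the header claims */
    struct unr_dst *unr_dst;   /* entries that were actually allocated */
};

/*
 * Walk at most dst_cnt entries, but stop early if the list is shorter.
 * Returns how many entries were really processed.
 */
unsigned process_unreachable(const struct rerrhdr *rh)
{
    unsigned done = 0;
    const struct unr_dst *tp = rh->unr_dst;

    for (unsigned i = 0; i < rh->dst_cnt && tp != NULL; i++, tp = tp->next) {
        /* A real handler would look up tp->unr_dst_ip here. */
        done++;
    }
    return done;
}
```

With a header that claims three entries but a list holding only two, the loop
processes two entries and returns instead of dereferencing a null pointer.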



A second advantage model checking has is that it checks for actual errors,
rather than having to reason about all the different ways the error could be
caused. If it catches a particular error type, it will do so no matter the cause of
the error. For example, a model checker such as CMC that runs code directly will
detect all null pointer dereferences, deadlocks, or any operation that causes a
runtime exception, since the code will crash or lock up. Importantly, it will detect
them without having to understand and anticipate all the ways that these errors
could arise. In contrast, static analysis cannot do such end-to-end checks, but
must instead look for specific ways of causing a given error. Errors caused by
actions that the checker does not know about or cannot analyze will not be
flagged, since static checkers minimize false positives by looking for errors only
in specific, analyzable contexts.
A good example is the error CMC found in the AODV specification, shown
in Figure 2. This error arises because the specification elides a check on the 
sequence number of a received packet. Here, the node receives a packet with 
a stale route with an old sequence number. The code (and the specification) 
erroneously updates the sequence number of the current route without checking 
if the route in the packet is valid. This results in a routing loop. Once the cause 
of the routing loop is known, it is possible (and easy) to statically ensure that all 
sequence number updates to the routing table from any received packets involve 
a validity check. However, there are only a few places where such specialized 
checks can be applied, making it hard to recoup the cost of writing the checker. 
Moreover, exhaustively enumerating all different causes for a routing loop is not 
possible. On the other hand, a model checker can check for actual errors without 
the need for reasoning about their causes. 

In a more general sense, model checking’s end-to-end checks mean it can 
give guarantees much closer to total correctness than static analysis can. No one 
would be at all surprised if code that passed all realistic static checks immediately 
crashed when actually run. On the other hand, given a good environment model 
and input sequences, it is much more likely that model checked code actually 
works when used. Because model checking verifies that the code was
correct on the executions that it tested, if state reduction techniques
allow these executions to cover much of the initial portion of the search space, it
will be difficult for an implementation to efficiently get into new, untested areas.
At the risk of being too optimistic, this suggests that even with a residual of bugs in
the model-checked implementation, it will be so hard to trigger them that they
are effectively not there.



// madhoc: rerr.c: rec_rerr
// recv_rt: route we just received from network.
// cur_rt: current route entry for same IP address.
cur_rt = getentry(recv_rt->dst_ip);
if (cur_rt != NULL && ...) {
    // Bug: updates sequence number without checking that
    // received packet newer than route table entry!
    cur_rt->dst_seq = recv_rt->dst_seq;

Fig. 2. The AODV specification error. A common pattern: this bug could have been
caught with static analysis, but there were so few places to check that it would be
difficult to recover the overhead of building a checker.
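The validity check missing from Fig. 2 can be sketched as a guarded update. This
is an illustration under stated assumptions, not the corrected specification
text: the trimmed route structure and function names are hypothetical, and the
signed-difference comparison is one common way to compare sequence numbers that
may wrap around.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down route entry. */
struct route {
    uint32_t dst_ip;
    uint32_t dst_seq;
};

/*
 * True if sequence number a is strictly newer than b.
 * The difference is interpreted as a signed 32-bit value so that the
 * comparison keeps working across wraparound.
 */
static bool seq_newer(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}

/* Update cur_rt only when the received route is fresher; report if updated. */
bool update_seq_if_newer(struct route *cur_rt, const struct route *recv_rt)
{
    if (cur_rt == NULL || recv_rt == NULL)
        return false;
    if (!seq_newer(recv_rt->dst_seq, cur_rt->dst_seq))
        return false;            /* stale or equal: keep the current entry */
    cur_rt->dst_seq = recv_rt->dst_seq;
    return true;
}
```

A packet carrying an older sequence number leaves the table entry untouched,
which is exactly the case the unguarded assignment in Fig. 2 gets wrong.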



A final model checking advantage was that there were true bugs that both 
methods would catch, but because they “could not happen” would be labeled 
as false positives when found with static analysis. In contrast, because model 
checking produces an execution trace to the bug, they would be correctly labeled. 
The best example was a case where an AODV node receives a “route response” 
packet in reply to a “route request” message it has sent. The code first looked 
up the route for the reply packet’s IP address and then used this route table
entry without checking for null. While a route table lookup can return null in
general, this particular lookup “cannot be null,” since a node only sends route
requests for valid route table entries. If this unchecked dereference were flagged
with static analysis, it would be labeled a false positive. However, if the node
(1) sends this message, (2) reboots, and (3) receives the response, the entry can
be null. Because model checking gave the exact sequence of unlikely events, the
error was relatively clear.



4.6 Summary 

The high bit for AODV: model checking hit more properties, but static hit 
more code; when they checked the same property static won. The latter was 
surprising since it implies that most bugs were shallow, requiring little analysis, 
perhaps because code difficult for the analysis to understand is similarly hard 
for programmers to understand. As with FLASH, the difference in time was 
significant: hours for static versus weeks for model checking. 

One view of the trade-off between the approaches is that static analysis checks
code well, but checks the implications of code relatively poorly. On the other
hand, model checking checks implications relatively better, but because of its
problems with abstraction and coverage, can be less effective checking the actual
code itself.

These results suggest that while model checking can get good results on real
systems code, in order to justify its significant additional effort it must
target properties not checkable statically.

5 Case Study: TCP 

This section describes our efforts in model checking the Linux TCP
implementation. We decided to check TCP after the relative success of AODV since it was
the hardest code we could think of in terms of finding bugs. There were several
sources of difficulty. First, the version we checked (from Linux 2.4.19) is roughly
ten times larger than the AODV code (50K lines versus 6K). Second, it is mature,
frequently audited code. Third, since almost all Linux sites constantly use
TCP, it is one of the most heavily tested pieces of open source code around.

We had expected TCP would require only modestly more effort than AODV.
As Section 5.1 describes below, this expectation was wildly naive. Further, as
Section 5.2 shows, getting code to run at all is a very different matter from
getting it to run so that you can comprehensively test it.

5.1 The Environment Problem: Lots of Time, Lots of False 
Positives 

The system to be model checked is typically present in a larger execution context. 
For instance, the Linux TCP implementation is present in the Linux kernel, 
and closely interacts with other kernel modules. Before model checking, it is 
necessary to extract the relevant portions of the system to be model checked
and create an appropriate environment that allows the extracted system to run 
stand alone. This environment should contain stubs for all external functions 
that the system depends on. With an implementation-level model checker like 
CMC, this process is very similar to building a harness for unit testing. However, 
building a comprehensive environment model for a large and complex system can 
be difficult. This difficulty is known to an extent in the model checking literature, 
but is typically underplayed. 

Extracting code amounts to deciding for each external function that the 
system calls, whether the function should be included in the checked code base, or 
to instead create a stub for the function and include the stub in the environment 
model. The advantages of including the function in the checked code are (1) the
model checker executes the function and thus can potentially find errors in it, and
(2) there is no need to create the stub or to maintain the stub as the code evolves.
However, including an external function in the system has two downsides: (1) 
this function can potentially increase the state space and (2) it can call additional 
functions for which stubs need to be provided. 

Conventional wisdom dictates that one cut along the narrowest possible in- 
terface. The idea is that this requires emulating the fewest possible number of 
functions, while minimizing the state space. However, while on the surface this 
makes sense, it was an utter failure for TCP. And, as we discuss below, we expect 
it to cause similar problems for any complex system. 



Failure: building a kernel library. Our first attempt to model check TCP 
did the obvious thing: we cut out the TCP code proper along with a few tightly 
coupled modules such as IP and then tried to make a “kernel library” that 
emulated all the functions this code called. Unfortunately, TCP’s comprehensive 
interaction with the rest of the kernel meant that despite repeated attempts and 
much agonizing we could only reduce this interface down to 150 functions, each 
of which we had to write a stub for. 

We abandoned this effort after months of trying to get the stubs to work 
correctly. We mainly failed because TCP, like other large complex subsystems, 
has a large, complex, messy interface to its host system. Writing a harness that
perfectly replicates the exact semantics of this (often poorly documented) in- 
terface is difficult. In practice these stubs have a myriad of subtle corner-case 
mistakes. Since model checkers are tuned to find inconsistencies in corner cases, 
they will generate a steady stream of bugs, all false. To make matters worse, 
these false positives tend to be much harder to diagnose than those caused by 
static analysis. The latter typically require seconds or, rarely, several minutes to 
diagnose. In contrast, TCP’s complexity meant that we could spend days trying 
to determine if an error report was true or false. One such example: an apparent 
TCP storage leak of a socket structure was actually caused by an incorrect stub 
implementation of the Linux timer model. The TCP implementation uses a function
mod_timer() to modify the expiration time of a previously queued timer.
This function’s return value depends on whether the timer is pending when the
function is called. Our initial stub always returned the same value. This mistake
confused the reference counting mechanism of the socket structures in an obscure 
way, causing a memory leak, which took a lot of manual examination to unravel. 
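
The mechanism behind this false positive can be illustrated with a toy model. The sketch below is not kernel code: the interface, the class names, and the policy "take one reference whenever a timer is newly armed" are assumptions made for illustration, based only on the description above.

```java
import java.util.HashSet;
import java.util.Set;

public class TimerStubSketch {
    /** Hypothetical stand-in for the timer layer (not the Linux API). */
    interface TimerModel {
        /** Re-arms a timer; returns true if it was already pending. */
        boolean modTimer(String timer);
    }

    /** Faithful model: remembers which timers are pending. */
    static final class FaithfulTimers implements TimerModel {
        private final Set<String> pending = new HashSet<>();
        public boolean modTimer(String timer) {
            return !pending.add(timer); // add() returns false if already present
        }
    }

    /** Sloppy stub: always answers "was not pending". */
    static final class ConstantStub implements TimerModel {
        public boolean modTimer(String timer) {
            return false;
        }
    }

    /** Assumed caller policy: a newly armed timer holds one reference. */
    static int referencesHeld(TimerModel timers, int rearmCount) {
        int refCount = 0;
        for (int i = 0; i < rearmCount; i++) {
            if (!timers.modTimer("retransmit")) {
                refCount++;
            }
        }
        return refCount;
    }

    public static void main(String[] args) {
        System.out.println(referencesHeld(new FaithfulTimers(), 3)); // 1
        System.out.println(referencesHeld(new ConstantStub(), 3));   // 3
    }
}
```

With the constant stub, every re-arm takes a reference that nothing ever releases, so the checker sees a leak that exists only in the environment model.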

It is conceivable that with more work we could have eventually replicated 
all 150 functions in the interface to perfection (at least until their semantics 
changed). However, we did not seem to be reaching a fixed point — in fact, each 
additional false positive seemed harder to diagnose. In the end, it was easier to
adopt the approach described next.

Surprising success: run Linux in CMC. While it seems intuitive that one 
should cut across the smallest interface point, it is also intuitive that one cut 
along well-defined and documented interfaces. It turns out that for TCP (and 
we expect for complex code in general) the latter is a better bet. It greatly sim- 
plifies the environment modeling. Additionally, it makes it less likely that these 
interfaces will change in future revisions, allowing the same environment model 
to be (re)used as the system implementation evolves. While the approach may 
force the model checker to deal with larger states and state spaces, the benefits 
of a clean environment model seem to outweigh the potential disadvantage. 

It turns out that for TCP there are only two very well-defined interfaces: (1) 
the system call interface that defines the interaction between user processes and 
the kernel and (2) the “hardware abstraction layer” that defines the interaction 
between the kernel and the architecture. Cutting the code at this level means 
that we wind up pulling the entire kernel into the model! While initially this 
sounded daunting, in practice it turned out to not be that difficult to “port” 
the kernel to CMC by providing a suitable hardware abstraction layer. This ease 
was in part because we could reuse a lot of work from the User Mode Linux 
(UML) [27] project, which had to solve many of the same problems in its aim
to run a working kernel as a user process. 

In order to check the TCP implementation for protocol compliance, we wrote 
a TCP reference model based on the TCP RFC. CMC runs this model alongside the
implementation, feeds both the same inputs, and reports a protocol violation
error whenever their states are inconsistent.
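
The lockstep comparison can be sketched as follows. This is a minimal illustration, not CMC's mechanism: the method names, the use of plain equality to compare states, and the two-transition toy protocol are all assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

public class LockstepCheckSketch {
    /** Index of the first input after which the two states differ, or -1 if none. */
    static <S, I> int firstDivergence(S start,
                                      BiFunction<S, I, S> implementation,
                                      BiFunction<S, I, S> reference,
                                      List<I> inputs) {
        S implState = start;
        S refState = start;
        for (int k = 0; k < inputs.size(); k++) {
            implState = implementation.apply(implState, inputs.get(k));
            refState = reference.apply(refState, inputs.get(k));
            if (!implState.equals(refState)) {
                return k; // would be reported as a protocol violation
            }
        }
        return -1;
    }

    /** Toy reference model with two transitions (illustrative only). */
    static String referenceStep(String state, String packet) {
        if (state.equals("LISTEN") && packet.equals("SYN")) return "SYN_RCVD";
        if (state.equals("SYN_RCVD") && packet.equals("ACK")) return "ESTABLISHED";
        return state;
    }

    /** Toy implementation that forgets the ACK transition. */
    static String buggyStep(String state, String packet) {
        if (state.equals("LISTEN") && packet.equals("SYN")) return "SYN_RCVD";
        return state;
    }

    static int demo() {
        List<String> packets = Arrays.asList("SYN", "ACK");
        return firstDivergence("LISTEN", LockstepCheckSketch::buggyStep,
                               LockstepCheckSketch::referenceStep, packets);
    }
}
```

Here `demo()` reports a divergence at input index 1, the ACK that the toy implementation ignores.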

5.2 The Coverage Problem: No Execute, No Bug 

As with dynamic checking tools, model checking can only find errors on executed 
code paths. In practice it is actually quite difficult to exercise large amounts of 
code. This section measures how comprehensively we could check TCP. 

We used two metrics to measure coverage. The first is line coverage of the 
implementation achieved during model checking. While the crudeness of this 
measure means it may not correspond to how well the system has been checked, 
it does effectively detect the parts that have not been tested. The second is “pro- 
tocol coverage,” which corresponds to the abstract protocol behaviors tested by 
the model checker. We calculate protocol coverage as the line coverage achieved 
in the TCP reference model mentioned above. This roughly represents the degree 
to which the abstract protocol transitions have been explored. 
Table 7. Coverage achieved during model refinement. The branching factor is a
measure of the state space size.

Description                  Line Coverage   Protocol Coverage   Branching Factor   Bugs
Standard server and client   47.4 %          64.7 %              2.91               2
+ simultaneous connect       51.0 %          66.7 %              3.67               0
+ partial close              52.7 %          79.5 %              3.89               2
+ message corruption         50.6 %          84.3 %              7.01               0
Combined Coverage            55.4 %          92.1 %


We used the two metrics to detect where we should make model checking 
more comprehensive. Low coverage often helped in pointing out errors in our 
environment model. Table 7 gives the coverage achieved with each step in the 
model refinement process. We measured coverage cumulatively using three search
techniques: breadth-first, depth-first, and random. In random search, each
generated state is given a random priority. Table 7 also reports the branching
factor of the state space as a measure of its size. Lower branching factors are
better, since they mean the state space grows exponentially more slowly with
each additional step. For the first three models the branching factor is
calculated from the number of states in the queue at depth 10 during a
breadth-first search. For the fourth model, CMC ran out of resources at depth 8,
and the branching factor is calculated at this depth.
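
The text does not give the exact formula used. One plausible reading, assumed here, is the effective branching factor b for which b^d equals the number of queued states at depth d. The sketch below computes it and, conversely, estimates a state count from b and d; the method names are ours.

```java
public class BranchingFactorSketch {
    /** Effective branching factor b such that b^depth == statesAtDepth (assumed definition). */
    static double effectiveBranchingFactor(long statesAtDepth, int depth) {
        if (statesAtDepth < 1 || depth < 1) {
            throw new IllegalArgumentException("need at least one state and depth >= 1");
        }
        return Math.pow(statesAtDepth, 1.0 / depth);
    }

    /** Rough number of states at a given depth for a given branching factor. */
    static double estimatedStates(double branchingFactor, int depth) {
        return Math.pow(branchingFactor, depth);
    }

    public static void main(String[] args) {
        // Branching factors and depths taken from Table 7 and the text.
        System.out.printf("standard model, depth 10: about %.0f states%n",
                estimatedStates(2.91, 10));
        System.out.printf("message corruption, depth 8: about %.0f states%n",
                estimatedStates(7.01, 8));
    }
}
```

Under this reading, the corrupted-packet model has far more states at depth 8 than the standard model has at depth 10, which is consistent with CMC running out of resources.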

The first model consists of a single TCP client communicating with a single 
TCP server. Once the connection is established, the client and server exchange 
data in both directions before closing the connection. This standard model dis- 
covered two protocol compliance bugs in the TCP implementation. The second 
model adds multiple simultaneous connections, which are initiated nondetermin- 
istically. The third model lets either end of the connection nondeterministically 
decide to close it during data transfer. This improved coverage and resulted 
in the discovery of two more errors. Finally, much of the remaining untested 
functionality was code to handle bad packets, so we corrupted packets by non- 
deterministically toggling selected key control flags in the TCP packet. While 
these corrupted packets triggered a lot of recovery code they also resulted in 
an enormous increase in the state space. Tweaking the environment the right 
way to achieve a more effective search still remains an interesting but unsolved 
problem. 

In the end we detected four errors in the Linux TCP implementation. All are 
instances where the implementation fails to meet the TCP specification. These 
errors are fairly complex and require an intricate sequence of events to trigger
them.
6 Conclusion

This paper has described trade-offs between static analysis and model
checking, as well as some of the surprises we encountered while applying model
checking to large software systems. Neither static analysis nor model checking
is at the stage where one dominates the other. Model checking gets more
properties, but static analysis hits more code; when they checked the same
property static analysis won.

The main advantages static analysis has over model checking: (1) gets all 
paths in all code that can compile, rather than just executed paths in code 
you can run, (2) only requires a shallow understanding of code, (3) applies in 
hours rather than weeks, (4) easily checks millions of lines of code, rather than 
tens of thousands, (5) can find thousands of errors rather than tens. The first 
question you ask with static analysis is “how big is the code?” Nicely, bigger is 
actually better, since it lets you amortize the fixed cost of setting up checking. 
Model checking’s first question is “what does the code do?” This is both because 
many program classes cannot be model checked and because doing so requires an 
intimate understanding of the code. Finally, given enough code we are surprised 
when static analysis gets no results, but less surprised if model checking does 
not (or if the attempt is abandoned). Most of these are direct implications of the
fact that model checking runs code and static analysis does not. 

We believe static analysis will generally win in terms of finding as many 
bugs as possible. In this sense it is better, since fewer bugs get users closer to
the desired goal of the absence of bugs (“total correctness”). However, model 
checking has advantages that seem hard for static analysis to match: (1) it can 
check the implications of code, rather than just surface-visible properties, (2) it 
can do end-to-end checks (the routing table has no loops) rather than having to 
anticipate and craft checks for all ways that an error type can arise, (3) it gives 
much stronger correctness results — we would be surprised if code crashed after 
being model checked, whereas we are not surprised at all if it crashes after being 
statically checked. 

A significant model checking drawback is the need to create a working, cor- 
rect environment model. We had not realized just how difficult this would be 
for large code bases. In all cases it added weeks to months of effort compared 
to static analysis. Also, practicality forced omissions in model behavior, both 
deliberate and accidental. In both FLASH and AODV, unmodeled code (such 
as omitting the I/O system or multicast support) led to many false negatives. 
Finally, because the model must perfectly replicate real behavior, we had to 
fight with many (often quite-tricky-to-diagnose) false positives during develop- 
ment. In TCP this problem eventually forced us to resort to running the entire 
Linux kernel inside our model checker rather than creating a set of fake stubs to 
emulate TCP’s interface to it. This was not anticipated. 

All three model checking case studies reinforced the following four lessons: 

1. No model is as good as the implementation itself. Any modification,
translation, or approximation is a potential source of false positives, risks
checking far fewer system behaviors, and of course risks missing critical errors.

2. Any manual work required in the model checking process becomes immensely
difficult as the scale of the system increases. In order to scale, a model checker
should require as little user input, annotation, and guidance as possible.

3. If a unit-test framework is not available, then define the system boundary
only along well-known, public interfaces.

4. Try to cover as much as possible: the more code you trigger, the more bugs
you find, and the more useful model checking is.
Abstract. We present a generic framework for the automatic and modular
inference of sound class invariants for class-based object oriented
languages. The idea is to derive a sound class invariant as a conservative 
abstraction of the class semantics. In particular we show how a class 
invariant can be characterized as the solution of a set of equations ex- 
tracted from the program source. Once a static analysis for the method 
bodies is supplied, a solution for the former equation system can be itera- 
tively computed. Thus, the class invariant can be automatically inferred. 
Moreover, our framework is modular since it allows the derivation of class 
invariants without any hypothesis on the instantiation context and, in 
the case of subclassing, without access to the parent code.



1 Introduction 

A class is correct or incorrect not by itself, but with respect to a
specification. For instance, a specification can be the absence of runtime
errors, such as null-pointer dereferences, or the absence of uncaught
exceptions. More generally, a specification can be expressed in a suitable
formal language. The software engineering community [14] proposes to annotate
the source code with class invariants, method preconditions and postconditions
in order to specify the desired behavior of the class. A class invariant is a
property valid for each instance of the class, before and after the execution
of any method, in any context. A method precondition is a property satisfied
before the execution of the method and a postcondition is a property valid just
after its execution.

The natural question with such an approach is: “Does the class respect its 
specification?”. The traditional approach is to monitor the assertions, so that 
for instance the preconditions and the class invariant are checked before the 
execution of a method. Such an approach has many drawbacks. For example, it 
requires checking arbitrarily complex assertions, so that it may introduce a
non-negligible slowdown at runtime. Moreover it is inherently not sound. In fact the
code must be executed in order to test if an assertion is violated or not. However, 
program execution or testing can only cover finitely many test cases so that the 
global validity of the assertion cannot be proved. Therefore the need for formal 
methods arises. 
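
To make the monitoring approach concrete, the sketch below checks a class invariant before and after every method of a small stack. The class and its invariant are illustrative assumptions of ours; the point is that each call pays for the check and that a passing run says nothing about executions that were not tried.

```java
public class MonitoredStackSketch {
    /** A small stack whose invariant is re-checked around every method call. */
    static final class MonitoredStack {
        private final Object[] stack;
        private int pos = 0;

        MonitoredStack(int size) {
            stack = new Object[Math.max(size, 1)];
            checkInvariant();
        }

        /** Runtime monitor for: 1 <= stack.length and 0 <= pos <= stack.length. */
        private void checkInvariant() {
            if (stack.length < 1 || pos < 0 || pos > stack.length) {
                throw new IllegalStateException("class invariant violated: pos=" + pos);
            }
        }

        boolean push(Object o) {
            checkInvariant();
            boolean stored = pos < stack.length;
            if (stored) stack[pos++] = o;
            checkInvariant();
            return stored;
        }

        boolean pop() {
            checkInvariant();
            boolean removed = pos > 0;
            if (removed) stack[--pos] = null;
            checkInvariant();
            return removed;
        }

        int depth() {
            return pos;
        }
    }

    static int demo() {
        MonitoredStack s = new MonitoredStack(2);
        s.push("a");
        s.push("b");
        s.push("c"); // rejected: the stack is full
        s.pop();
        return s.depth();
    }
}
```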

The approach based on abstract interpretation consists in computing an
approximation of the class semantics and then checking whether it satisfies the
specification. In particular if a sound class invariant, which is inferred from the
program source, matches the specification then the class itself matches the
specification (because of soundness). Therefore a static analyzer capable of
inferring sound class invariants can be used as an effective verification tool [3].
Furthermore class invariants can be used to optimize the compiled code, e.g. to
drop superfluous exception handlers or synchronizations, and for code
documentation.

An effective static analysis framework for object oriented languages must 
take into account three issues: 

- the different instantiation contexts of a class (context-modularity);

- the presence in the class to be verified of references to classes whose code is
  not available (has-a-modularity);

- the specialization through inheritance of a class whose source code is not
  available (is-a-modularity).

Our Work. We present a generic framework for the automatic and modular 
inference of sound class invariants for class-based object oriented languages. 

Our framework is highly generic: it is language independent and, more
importantly, any abstract domain can be plugged in. As static analyses consider
particular properties (e.g. pointer aliasing or linear relationships between
variables), the choice of a particular analysis influences the property reflected
by the inferred class invariant, and hence the check of the program
specification. For instance, if the specification is, in some formal language, “The
class never raises the null-pointer exception” then we are likely to instantiate
the framework with a pointer analysis, in order to show that the methods in the
class never cause the throwing of such an exception.

The framework is fully context-modular, as a class can be analyzed aside from
the context in which it is used. It is has-a-modular, because if a class C references
a class H (e.g. C contains a variable of type H or a method of C calls a function
of H) then the H invariant can replace the H source code. Furthermore it is
is-a-modular because a subclass can be analyzed by just accessing the superclass
invariant, and not its source code.

Related Work. Several static analyses have been developed for object ori- 
ented languages as e.g. [2] and some of them have modular features e.g. [4] in 
that they analyze a program fragment without requiring the full program to 
stay in memory. Nevertheless they are different from the present work in that 
none of them is able to discover class invariants, essentially because they do not 
take into account peculiarities of object oriented languages such as the state 
encapsulation. 

Some other works can infer class invariants w.r.t. a particular property [1,
17, 8, 9]. Our work is more general than these. For instance, we consider arbitrary
properties, we do not require any human interaction (unlike [17,9]) and we deal 
with the problem of computing subclass invariants without seeing the parent’s 
code. Moreover, our approach is based on a conservative approximation of the 
class semantics so, unlike [8], it is sound. 

In a previous work [12] we introduced a different approach for the analysis of 
object oriented languages, based on the use of symbolic relations and class-to- 
class StackError extends Exception {}
public class Stack {
  // Inv : 1 <= size, 0 <= pos <= size, size = stack.length
  protected int size, pos;
  protected Object[] stack;

  Stack(int size) {
    this.size = Math.max(size, 1); this.pos = 0;
    this.stack = new Object[this.size];
  }

  boolean isEmpty() { return (pos <= 0); }

  boolean isFull() { return (pos >= size); }

  Object top() throws StackError {
    if (!isEmpty())
      // 0 <= pos-1 < stack.length
      return stack[pos-1];
    else throw new StackError();
  }

  void push(Object o) throws StackError {
    if (!isFull()) {
      // 0 <= pos < size
      stack[pos++] = o;
    } else throw new StackError();
  }

  void pop() throws StackError {
    if (!isEmpty())
      pos--;
    else throw new StackError();
  }
}



Fig. 1. Java source for the Stack class. 



class transformations. The main novelty of the present work is the handling of 
inheritance. In fact, we introduce a generic framework for the modular inference 
of class invariants, in which arbitrary abstract domains can be plugged in. Then,
we show how it can be used for inferring subclass invariants without accessing
the parent code. To the best of our knowledge the present paper is the first
one to address the problem of is-a-modular analyses.

2 Examples 

We will illustrate the results of this paper through the examples in Fig. 1 and 
Fig. 2. The first class, Stack, is the implementation of a stack parameterized
by its size, specified at object creation time. It provides methods for pushing
and popping elements as well as testing if the stack is empty or full. Moreover,
as the internal representation of the stack is hidden, a stack object can only be
manipulated through its methods. The second class, StackWithUndo, extends the
first one by adding to the stack the capability of undoing the last
operation.
The comments in the figures are automatically derived when the framework
presented in this paper is instantiated with the Octagon abstract domain [15]. 
In particular, for the Stack we have been able to discover the class invariant 
Inv without any hypothesis on the class instantiation context so the analysis 
is context-modular. The invariant Inv guarantees that the array stack is never 
accessed out of its boundaries. This implies that the out-of-bounds exception is 
never thrown (verification) so that the bounds checks at lines 20 and 26 can be 
omitted from the generated bytecode (optimization). 

The class invariant for StackWithUndo, SubInv, states that the parent class
invariant is still valid and moreover the field undoType cannot assume values
outside the interval [-1, 1]. It implies that the method undo will never raise the
exception StackError. Once again this information can be used for verification (if
a class never raises an exception Exc, then the exceptional behavior described
by Exc is never shown) and for optimization (as the exception handling can be
dropped). Finally it is worth noting that SubInv has been obtained without
accessing the parent code, using just its class invariant, thus in an
is-a-modular fashion.
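
Both claims can be sanity-checked, though not proved, by brute force. The sketch below is our own test harness, not part of the framework: it copies the two classes from the figures (omitting top), replays every sequence of push, pop and undo up to a given length on small stacks, and checks after each call that SubInv holds and that undo never throws StackError. Unlike the inferred invariant, such a bounded check covers only the executions it tries.

```java
public class BoundedUndoCheck {
    static class StackError extends Exception {}

    static class Stack {
        protected int size, pos;
        protected Object[] stack;
        Stack(int size) {
            this.size = Math.max(size, 1); this.pos = 0;
            this.stack = new Object[this.size];
        }
        boolean isEmpty() { return pos <= 0; }
        boolean isFull() { return pos >= size; }
        void push(Object o) throws StackError {
            if (!isFull()) stack[pos++] = o; else throw new StackError();
        }
        void pop() throws StackError {
            if (!isEmpty()) pos--; else throw new StackError();
        }
    }

    static class StackWithUndo extends Stack {
        protected Object undoObject;
        protected int undoType;
        StackWithUndo(int x) { super(x); undoType = 0; undoObject = null; }
        @Override void push(Object o) throws StackError { undoType = 1; super.push(o); }
        @Override void pop() throws StackError {
            if (!isEmpty()) { undoType = -1; undoObject = stack[pos - 1]; }
            super.pop();
        }
        void undo() throws StackError {
            if (undoType == -1) { super.push(undoObject); undoType = 0; }
            else if (undoType == 1) { super.pop(); undoType = 0; }
        }
        /** SubInv as stated in the comments of the figure. */
        boolean subInv() {
            boolean inv = 1 <= size && 0 <= pos && pos <= size && size == stack.length;
            boolean undoOk = undoType == 1 ? 0 < pos
                           : undoType == -1 ? pos < size
                           : undoType == 0;
            return inv && undoOk;
        }
    }

    /** Replays all call sequences of length <= maxLen; returns the number of runs. */
    static long check(int maxSize, int maxLen) {
        long runs = 0;
        for (int size = 0; size <= maxSize; size++) {
            long sequences = 1; // 3^len sequences of length len
            for (int len = 0; len <= maxLen; len++, sequences *= 3) {
                for (long code = 0; code < sequences; code++, runs++) run(size, len, code);
            }
        }
        return runs;
    }

    /** Decodes 'code' in base 3 into calls: 0 = push, 1 = pop, 2 = undo. */
    static void run(int size, int len, long code) {
        StackWithUndo s = new StackWithUndo(size);
        require(s.subInv(), "SubInv after constructor");
        for (int step = 0; step < len; step++, code /= 3) {
            int op = (int) (code % 3);
            try {
                if (op == 0) s.push("x"); else if (op == 1) s.pop(); else s.undo();
            } catch (StackError e) {
                require(op != 2, "undo threw StackError");
            }
            require(s.subInv(), "SubInv after call");
        }
    }

    static void require(boolean ok, String what) {
        if (!ok) throw new AssertionError(what);
    }
}
```

Calling `BoundedUndoCheck.check(3, 7)` replays every sequence of at most seven calls on stacks created with sizes 0 to 3 and throws an AssertionError on the first violation.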

3 Syntax and Concrete Semantics 

The concrete semantics describes the properties of the execution of programs. 
The goal of a static analysis is to provide an effective computable approximation 
of the concrete semantics [5]. Therefore, the first step for the design of a static 
analysis is the definition of the concrete semantics. 

A class is a template for objects. It is provided by the programmer, who
specifies the fields, the methods and the class constructor. It can thus be
abstractly modeled by a triple, as in the following definition:

Definition 1. A class $C$ is a triple $\langle F, \mathtt{init}, M \rangle$ where $F$ is a set of distinct
variables, $\mathtt{init}$ is the class constructor and $M$ is a set of function definitions.

It is worth noting that for the sake of generality in the previous definition 
we did not ask to have typed fields or methods. Moreover we assume that all 
the class fields are protected. This is just to simplify the exposition, and it does 
not cause any loss of generality: in fact any external access to a field f can be 
simulated by a couple of methods set_f/ get_f . 

Given a class $C = \langle F, \mathtt{init}, M \rangle$, every instance of $C$ has an internal state
$\sigma \in \Sigma$ that is a function from fields to values, i.e. $\Sigma = [F \to D_{val}]$.

When a class is instantiated, for example by means of the new construct,
the class constructor is called to set up the new object internal state. This can
be modeled by a semantic function $\mathsf{i}[\![\mathtt{init}]\!] \in [D_{in} \to \mathcal{P}(\Sigma)]$, where $D_{in}$ is the
semantic domain for the constructor input values (if any). We consider sets in
order to model non-determinism, e.g. user input.

The semantics of a method $m$ is a function $\mathsf{m}[\![m]\!] \in [D_{in} \times \Sigma \to \mathcal{P}(D_{out} \times \Sigma)]$.
Indeed a method is called with two parameters: the method actual parameters
and the internal state of the object it belongs to. The output of a method is
class StackWithUndo extends Stack {
  // SubInv : Inv, -1 <= undoType <= 1,
  //   if undoType == 1 then 0 < pos
  //   else if undoType == 0 then 0 <= pos <= size
  //   else if undoType == -1 then pos < size
  protected Object undoObject;
  protected int undoType;

  StackWithUndo(int x) {
    super(x);
    undoType = 0; undoObject = null;
  }

  void push(Object o) throws StackError {
    undoType = 1;
    super.push(o);
  }

  void pop() throws StackError {
    if (!isEmpty()) {
      undoType = -1;
      undoObject = stack[pos-1];
    }
    super.pop();
  }

  // StackError never thrown
  void undo() throws StackError {
    if (undoType == -1) {
      super.push(undoObject);
      undoType = 0;
    } else if (undoType == 1) {
      super.pop();
      undoType = 0;
    }
  }
}



Fig. 2. Stack class extension with undo capabilities. 



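a set of pairs ⟨return value (if any), new object state⟩. The set of the initial
states is:

$$S_0 = \{\sigma \mid \exists v \in D_{in}.\ \sigma \in \mathsf{i}[\![\mathtt{init}]\!](v)\},$$

and the method collecting semantics $\mathsf{M}[\![m]\!] \in [\mathcal{P}(\Sigma) \to \mathcal{P}(\Sigma)]$ is:

$$\mathsf{M}[\![m]\!](S) = \{\sigma' \in \Sigma \mid \exists \sigma \in S.\ \exists v \in D_{in}.\ \exists v' \in D_{out}.\ \langle v', \sigma' \rangle \in \mathsf{m}[\![m]\!](v, \sigma)\}.$$

Then the class reachable states, $\mathsf{c}[\![C]\!]$, are given by the least solution, w.r.t. set
inclusion, of the following recursive equation, where $n$ is the number of methods
of $C$:

$$S = S_0 \cup \bigcup_{i=1}^{n} \mathsf{M}[\![m_i]\!](S). \tag{1}$$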

The above equation characterizes, according to the intuition, the set of states
that are reachable before and after the execution of any method in any instance
of the class¹. Thus it is a class invariant. However, in general it is not computable,
so we need to perform an abstraction in order to safely approximate $\mathsf{c}[\![C]\!]$.
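
When the state space is finite, the least solution of (1) can be computed directly. The sketch below does so with a worklist for a simplified Stack whose state is just the pair (size, pos). The encoding, the bounded range of constructor inputs, and the assumption that a call throwing StackError leaves the state unchanged are ours.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

public class ReachableStatesSketch {
    /** Least solution of S = S0 U M[m1](S) U ... U M[mn](S), by worklist iteration. */
    static <T> Set<T> reachable(Set<T> initial, List<Function<T, Set<T>>> methods) {
        Set<T> states = new HashSet<>(initial);
        Deque<T> work = new ArrayDeque<>(initial);
        while (!work.isEmpty()) {
            T s = work.pop();
            for (Function<T, Set<T>> method : methods) {
                for (T next : method.apply(s)) {
                    if (states.add(next)) work.push(next); // newly discovered state
                }
            }
        }
        return states;
    }

    /** A state is the pair (size, pos). */
    static List<Integer> state(int size, int pos) {
        return Arrays.asList(size, pos);
    }

    /** Reachable states of the simplified Stack for constructor inputs -1..maxInput. */
    static Set<List<Integer>> stackStates(int maxInput) {
        Set<List<Integer>> initial = new HashSet<>();
        for (int input = -1; input <= maxInput; input++) {
            initial.add(state(Math.max(input, 1), 0)); // effect of the constructor
        }
        Function<List<Integer>, Set<List<Integer>>> push = s ->
            Collections.singleton(s.get(1) < s.get(0) ? state(s.get(0), s.get(1) + 1) : s);
        Function<List<Integer>, Set<List<Integer>>> pop = s ->
            Collections.singleton(s.get(1) > 0 ? state(s.get(0), s.get(1) - 1) : s);
        return reachable(initial, Arrays.asList(push, pop));
    }

    /** Checks 1 <= size and 0 <= pos <= size on every reachable state. */
    static boolean allSatisfyInv(Set<List<Integer>> states) {
        for (List<Integer> s : states) {
            if (s.get(0) < 1 || s.get(1) < 0 || s.get(1) > s.get(0)) return false;
        }
        return true;
    }
}
```

With unbounded constructor inputs the set of reachable states is infinite, so this direct computation no longer terminates.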

Two observations on the above-defined class concrete semantics. The first
one is that the domains $D_{val}$ and $D_{out}$ are left as parameters, so that e.g.
they can be instantiated to handle multiple parameters or to model memory and
hence object aliasing. The second observation is that (1) abstracts away from
the values returned by a method. This is sound as long as the returned values do
not expose the object internal state. If this is not the case, then the context can
arbitrarily access and modify an object internal state. Hence, as a class invariant
must hold for all the class objects in all the instantiation contexts, the only
sound invariant for the exposed part of the object state is that “it can take any
value”. As a consequence, in such a case the problem of inferring a class invariant
reduces to the use of an escape analysis [2] to determine which part of the object
state is exposed to the context. In the rest of the paper we study the case in
which the object state is encapsulated, so it can be modified only by the class
methods.

4 Abstract Semantics 

We use the abstract interpretation framework [5] in order to compute a safe
upper-approximation of $\mathsf{c}[\![C]\!]$. So, let $\langle D^\sharp, \sqsubseteq^\sharp \rangle$ be an abstract domain
approximating sets of states, i.e. it is linked to the concrete domain by a Galois
connection:

$$\langle \mathcal{P}(\Sigma), \subseteq \rangle \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} \langle D^\sharp, \sqsubseteq^\sharp \rangle.$$

Moreover let us consider the abstract counterparts for the constructor and the
method collecting semantics such that the initial states are approximated by
the abstract counterpart of the constructor semantics, i.e. $S_0 \subseteq \gamma(\mathsf{i}[\![\mathtt{init}]\!]^\sharp)$, and
the method semantics is soundly approximated by an abstract semantic function
$\mathsf{M}[\![m_i]\!]^\sharp \in [D^\sharp \to D^\sharp]$ such that $\forall d^\sharp \in D^\sharp.\ \alpha \circ \mathsf{M}[\![m_i]\!] \circ \gamma(d^\sharp) \sqsubseteq^\sharp \mathsf{M}[\![m_i]\!]^\sharp(d^\sharp)$.
Then it is possible to state the following theorem, that gives a characterization
of class invariants as solutions of a system of recursive equations on $D^\sharp$ that
mimics (1):

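Theorem 1. Let $C = \langle F, \mathtt{init}, \{m_1, \ldots, m_n\} \rangle$ be a class and $\langle D^\sharp, \sqsubseteq^\sharp \rangle$ an abstract
domain such that $\langle \mathcal{P}(\Sigma), \subseteq, \cup \rangle \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} \langle D^\sharp, \sqsubseteq^\sharp, \sqcup^\sharp \rangle$. Moreover, let
$\mathsf{i}[\![\mathtt{init}]\!]^\sharp \in D^\sharp$ be such that $S_0 \subseteq \gamma(\mathsf{i}[\![\mathtt{init}]\!]^\sharp)$ and let $\mathsf{M}[\![m_i]\!]^\sharp \in [D^\sharp \to D^\sharp]$ be an
abstract semantic function such that $\alpha \circ \mathsf{M}[\![m_i]\!] \circ \gamma \sqsubseteq^\sharp \mathsf{M}[\![m_i]\!]^\sharp$. Then any $I \in D^\sharp$
that is a solution of the following recursive equation:

$$I = \mathsf{i}[\![\mathtt{init}]\!]^\sharp \sqcup^\sharp \mathop{{\bigsqcup}^\sharp}_{i=1}^{n} \mathsf{M}[\![m_i]\!]^\sharp(I) \tag{2}$$

is such that $\mathsf{c}[\![C]\!] \subseteq \gamma(I)$.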

¹ In general, the body of a method $m_i$ may invoke a method $m_j$ that belongs to the
same class. However, as such a call is private to the class, the class invariant
is not required to hold at that point [14].
The least solution of the above system of equations can be computed using
the standard Tarski-Kleene iterations. However, in general the height of the
domain is infinite. In such a case, a widening operator $\nabla^\sharp \in [D^\sharp \times D^\sharp \to D^\sharp]$
must be used to force the convergence of the iterations to a post-fixpoint [5].
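
The role of widening can be seen on a one-variable example. The sketch below is only an illustration: it uses the interval domain rather than Octagons, the textbook interval widening, and a toy class with a single field initialized to 0 and a single method that increments it.

```java
import java.util.function.UnaryOperator;

public class IntervalWideningSketch {
    static final long NEG_INF = Long.MIN_VALUE;
    static final long POS_INF = Long.MAX_VALUE;

    /** Closed interval [lo, hi]; the sentinels above stand for infinite bounds. */
    static final class Interval {
        final long lo, hi;
        Interval(long lo, long hi) { this.lo = lo; this.hi = hi; }

        Interval join(Interval o) {
            return new Interval(Math.min(lo, o.lo), Math.max(hi, o.hi));
        }

        /** Textbook interval widening: a bound that grew jumps to infinity. */
        Interval widen(Interval next) {
            return new Interval(next.lo < lo ? NEG_INF : lo, next.hi > hi ? POS_INF : hi);
        }

        /** Abstract effect of "x++"; infinite bounds stay infinite. */
        Interval plusOne() {
            return new Interval(lo == NEG_INF ? NEG_INF : lo + 1,
                                hi == POS_INF ? POS_INF : hi + 1);
        }

        boolean sameAs(Interval o) { return lo == o.lo && hi == o.hi; }
    }

    /** Iterates x := x widen (init join method(x)) until a post-fixpoint is reached. */
    static Interval solve(Interval init, UnaryOperator<Interval> method) {
        Interval x = init;
        while (true) {
            Interval next = x.widen(init.join(method.apply(x)));
            if (next.sameAs(x)) return x;
            x = next;
        }
    }

    /** Invariant of the toy class: field starts at 0, one method increments it. */
    static Interval counterInvariant() {
        return solve(new Interval(0, 0), Interval::plusOne);
    }
}
```

Plain Kleene iteration would produce [0,0], [0,1], [0,2], ... without ever stabilizing; the widened iteration stops after two steps with the lower bound 0 and an infinite upper bound.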

The result of Th. 1 can be extended to cope with methods that expose (a part
of) the internal state. For example, consider a method m that returns a reference
to an object field f. The context is then free to arbitrarily modify the value of
the field f. This situation can be handled by considering an abstract domain
expressive enough to contain escape information [2]. In such a setting, the static
analysis of m will determine that f escapes its scope and the corresponding
postcondition $l_i$ will be such that² $l_i \downarrow \{f\} = \top^\sharp$. This is the only sound assumption
if we want to infer a class invariant valid for all the contexts.

Example 1. A class invariant for Stack can be inferred by instantiating $D^\sharp$ with
the Octagon abstract domain [15]:

$$I = \{1 \le \mathtt{size},\ 0 \le \mathtt{pos} \le \mathtt{size},\ \mathtt{size} = \mathtt{stack.length}\}.$$

Modularity. Our approach is, first of all, context-modular, as the derivation of
a class invariant is done without any hypothesis on the module calling context.
On the other hand, our approach is not fully has-a-modular, i.e. if a class $C_i$
to be analyzed has a variable (field, method formal parameter, etc.) of type $C_j$
then $C_j$ must be analyzed before $C_i$. However the analysis of $C_i$ does not strictly
require the code of $C_j$: it can use the class invariant and the postconditions
of the methods. Any reference in $C_i$ to an object of type $C_j$ can be conservatively
substituted by the $C_j$ class invariant, and any invocation of a $C_j$ method
can be replaced by its postcondition. The advantage of this approach is that
if two or more distinct classes use the same class $C_j$ then it can be analyzed
once and its result can be used many times, speeding up the whole analysis.
Finally, if two or more classes depend on each other in a cyclic way, then they
must either be analyzed together or the circularity must be broken with the
technique of cutpoints described in [6, §8.3]. A third kind of modularity,
is-a-modularity, will be considered in the next section.

Full Program Analysis. In general an object-oriented program consists of 
a set of classes {Ci} and a main expression. For the moment we do not consider 
inheritance. From the set of classes we can statically derive the (inverse) graph 
of the has-a relation. This graph is such that the nodes are the classes of the 
program and there is an edge from Cj to Ci if and only if the class Ci has fields or 
local variables of type Cj . The program analysis begins with the initial nodes, i.e. 
the nodes that have no predecessor. Once these nodes are analyzed (if possible 
in parallel on distinct computation units) the successors can be considered and 
so on. If a cycle is found, then the considerations of the last section apply. Most 
of the time, the analysis of a full program does not require the analysis of all the 
classes it uses. In fact, in general a program will use some library classes. These 
can be analyzed once (as our analysis is context-modular) and the result (re)used
by all the programs that use them.
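
The scheduling described above is a topological traversal of the has-a graph. A minimal sketch follows, assuming the graph is given as a map from each class name to the set of classes it has fields or local variables of. The method names and the sample classes are hypothetical, and cycles are only detected, not broken.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class HasAOrderSketch {
    /** Groups classes into layers; all classes in a layer can be analyzed in parallel. */
    static List<List<String>> analysisLayers(Map<String, Set<String>> hasA) {
        Map<String, Integer> unresolved = new TreeMap<>();
        Map<String, List<String>> usedBy = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : hasA.entrySet()) {
            unresolved.put(e.getKey(), e.getValue().size());
        }
        for (Map.Entry<String, Set<String>> e : hasA.entrySet()) {
            for (String dep : e.getValue()) {
                unresolved.putIfAbsent(dep, 0); // class that appears only as a field type
                usedBy.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        List<String> ready = new ArrayList<>();
        for (Map.Entry<String, Integer> e : unresolved.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey()); // initial nodes: no predecessor
        }
        List<List<String>> layers = new ArrayList<>();
        int analyzed = 0;
        while (!ready.isEmpty()) {
            layers.add(ready);
            analyzed += ready.size();
            List<String> next = new ArrayList<>();
            for (String c : ready) {
                for (String user : usedBy.getOrDefault(c, Collections.emptyList())) {
                    if (unresolved.merge(user, -1, Integer::sum) == 0) next.add(user);
                }
            }
            Collections.sort(next);
            ready = next;
        }
        if (analyzed != unresolved.size()) {
            throw new IllegalStateException("cyclic has-a dependencies: analyze those classes together");
        }
        return layers;
    }

    static List<List<String>> demo() {
        Map<String, Set<String>> hasA = new HashMap<>();
        hasA.put("Main", new HashSet<>(Arrays.asList("App")));
        hasA.put("App", new HashSet<>(Arrays.asList("Stack", "Logger")));
        hasA.put("Stack", new HashSet<>());
        return analysisLayers(hasA);
    }
}
```

For the sample graph, `demo()` yields the layers [Logger, Stack], then [App], then [Main].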

² $I \downarrow V$ denotes the projection of the invariant $I$ on the set of variables $V$.
5 Class Invariant Inference in Presence of Inheritance

Now we address the problem of inferring a class invariant for a class S = “E 
extends C” , for some base class C and extension E. The immediate method is 
to take the code of C, that of E and then to apply the analysis described in the 
previous section to the expanded class S [16]. However such a naive approach 
has many drawbacks. The first one is code blow-up. In fact, it is known [16, §6.4] 
that the expansion of the inheritance causes (in the worst case) a quadratic blow 
up of the code size. Therefore, the direct analysis of the expanded code causes
a quadratic loss of performance. Moreover, in general C can be the base class
for two distinct extensions E and E'. In that case the code of C will be expanded
and hence analyzed twice, with a further performance loss. Finally, in some
cases the C source code is not available, e.g. with applications that use
third-party libraries. In that case, the library providers are unlikely to distribute the
source code. A reasonable solution may be to ship the class invariants together 
with the compiled code. In that way the analysis of S will use the source code 
of E and the class invariant for C, instead of its source. In order to do it, we 
consider the two orthogonal aspects of inheritance: class extension and method 
redefinition. 

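Class Extension. A subclass may extend the behavior of the superclass by
adding methods. For the moment, we do not consider redefinition of methods,
so that the general form of $S$ is $\langle F_C \cup F_E, \mathtt{init}_E, \{m_1, \ldots, m_n\} \cup \{n_1, \ldots, n_k\} \rangle$, where
$F_C$ and $m_i$ are the fields and the methods from the base class $C$ and $F_E$ and $n_i$
those from the class extension $E$. As a consequence, the equation system (2),
instantiated for the subclass $S$, becomes:

$$J = \mathsf{i}[\![\mathtt{init}_E]\!]^\sharp \sqcup^\sharp \mathop{{\bigsqcup}^\sharp}_{i=1}^{n} \mathsf{M}[\![m_i]\!]^\sharp(J) \sqcup^\sharp \mathop{{\bigsqcup}^\sharp}_{i=1}^{k} \mathsf{M}[\![n_i]\!]^\sharp(J). \tag{3}$$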

Now, the goal is to solve the equation above in a smarter way than performing
a brute-force fixpoint computation. In particular, we are interested in a solution
that is a function of the base class invariant, so that the methods $m_i$ do not need
to be analyzed again. To this end, the first remark is that the property
$J_0 = \mathsf{i}[\![\mathtt{init}_E]\!]^\sharp$ is about the variables in $F_C \cup F_E$, so that it can be split in two
parts, $J_0 = J_0^C \sqcup^\sharp J_0^E$, where the first refers to the inherited fields and the latter to
the fields in $F_E$. As for $J_0^C$, it is reasonable to assume $J_0^C \sqsubseteq^\sharp I$, i.e. at creation
time the subclass objects do not violate the superclass invariant³. This is a very
common situation in object oriented programming: for example in C++ [7] it is
a standard procedure to call the superclass constructors to set up the inherited
fields and then to initialize the fields in $F_E$, and even the Java semantics [11] forces
the initialization of the base class fields before the subclass constructor(s) can
access them.

The next step is to look for a solution of (3) with a particular shape.
Informally, $S$ can behave either as the base class $C$ or the extension $E$. Thus it is
reasonable to look for a solution in the form of $I \sqcup^\sharp X$, where $I$ is the invariant of
® Recall that the order relation is the abstract counterpart of logical implication. 
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the base class C and X involves just the methods of E. Formally, X is a solution 
of the following recursive equation: 



k 

x = Jo"u^ (4) 

As a consequence we obtain the following equation system:

$$\{(3),\ (4),\ J_0^C = I_0,\ J = I \;\bar{\sqcup}\; X\}. \qquad (5)$$

It is worth noting that a solution J of (5) is a solution of (3), whereas in general
the converse does not hold. The interest of using (5) is that the computation
of J reduces to the computation of (4), so that the subclass invariant J can be
obtained using just the superclass invariant (as $J = I \;\bar{\sqcup}\; X$). Furthermore, it is
possible to show that when analyzing a class hierarchy, in general the obtained
speedup is linear in the number of direct descendants of a class and quadratic
in the depth of the class hierarchy. The following theorem gives a sufficient and
necessary condition for the existence of solutions of (3):

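Theorem 2. If $\bar{\mathbb{M}}[\![\cdot]\!]$ is a join-morphism then the equation system (5) has a
solution iff

$$\bar{\bigsqcup}_{i=1}^{n} \bar{\mathbb{M}}[\![m_i]\!](X) \;\bar{\sqsubseteq}\; I \;\bar{\sqcup}\; X. \qquad (6)$$

Proof. Using the identities of (5) it is possible to rewrite the equation (3) as
follows:

$$\begin{aligned}
J &= \bar{\mathbb{i}}[\![\mathtt{init}_E]\!] \;\bar{\sqcup}\; \bar{\bigsqcup}_{i=1}^{n} \bar{\mathbb{M}}[\![m_i]\!](J) \;\bar{\sqcup}\; \bar{\bigsqcup}_{j=1}^{k} \bar{\mathbb{M}}[\![n_j]\!](J) \\
  &= I_0 \;\bar{\sqcup}\; J_0^E \;\bar{\sqcup}\; \bar{\bigsqcup}_{i=1}^{n} \bar{\mathbb{M}}[\![m_i]\!](I \;\bar{\sqcup}\; X) \;\bar{\sqcup}\; \bar{\bigsqcup}_{j=1}^{k} \bar{\mathbb{M}}[\![n_j]\!](I \;\bar{\sqcup}\; X) \\
  &= I_0 \;\bar{\sqcup}\; \bar{\bigsqcup}_{i=1}^{n} \bar{\mathbb{M}}[\![m_i]\!](I) \;\bar{\sqcup}\; \bar{\bigsqcup}_{i=1}^{n} \bar{\mathbb{M}}[\![m_i]\!](X) \;\bar{\sqcup}\; J_0^E \;\bar{\sqcup}\; \bar{\bigsqcup}_{j=1}^{k} \bar{\mathbb{M}}[\![n_j]\!](I \;\bar{\sqcup}\; X) \\
  &= I \;\bar{\sqcup}\; \bar{\bigsqcup}_{i=1}^{n} \bar{\mathbb{M}}[\![m_i]\!](X) \;\bar{\sqcup}\; X.
\end{aligned}$$

From basic lattice theory, it follows that the above equation is consistent with
$J = I \;\bar{\sqcup}\; X$ iff (2) holds. □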

If the subclass preserves the parent invariant w.r.t. the inherited fields, i.e.
$X|_{F_C} \;\bar{\sqsubseteq}\; I$, then the above equation system admits solutions. In general it
may be difficult to prove the theorem hypothesis, and in the worst case doing so is
computationally equivalent to solving (3). However, it can be shown that a large
class of static analyses, namely the symbolic relational ones [13,6,12], satisfies
the hypothesis of the theorem, so that (2) must be checked once and for all.
Examples of symbolic relational static analyses are Octagons, Relevant Context
Inference [4] and types. The next example shows that whilst a solution of (5) is a
solution of (3), in general it is not the least solution.

Example 2. The class ClosedSystem, which models closed physical systems with
a total energy $c_0$ and two different kinds of internal energy a and b, can be
defined as C = ⟨{a, b}, init, {add, sub}⟩. The constructor is
init = λ(a_0, c_0). (a := a_0; b := c_0 − a_0) and the two methods are
add = λ(). (a := a + 1; b := b − 1) and sub = λ(). (a := a − 1; b := b + 1).
Using (2) and the Octagon abstract domain it is possible to infer the class invariant
$I = \{a + b = c_0\}$.
Let us consider the extension ExtendedSystem with a field c, a new constructor
init = λ(a_0, c_0). (super.init(a_0, c_0); c := c_0) and a method
ext = λ(). (c := c − 1; a := a − 1).

If we apply (5) we obtain $X = \{a + b = c,\ c \le c_0\}$ and
$J = I \;\bar{\sqcup}\; X = \{a + b \le c_0,\ c \le c_0\}$. However, the direct application of (3) gives
$J' = \{a + b = c,\ c \le c_0\}$. It is immediate to see that J' is a more precise invariant
than J, that is $J' \;\bar{\sqsubseteq}\; J$. □
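
The difference between J and J' in Example 2 can also be observed on concrete executions. The following is a minimal sketch, not the analyzer described in this paper: it encodes the two classes directly in Python, enumerates the object states reachable by bounded method sequences, and tests both candidate invariants on them. The helper `reachable_states`, the depth bound, and the chosen constructor arguments are assumptions made for illustration only.

```python
class ClosedSystem:
    def __init__(self, a0, c0):
        self.a = a0
        self.b = c0 - a0

    def add(self):
        self.a += 1
        self.b -= 1

    def sub(self):
        self.a -= 1
        self.b += 1


class ExtendedSystem(ClosedSystem):
    def __init__(self, a0, c0):
        super().__init__(a0, c0)   # set up the inherited fields first
        self.c = c0

    def ext(self):
        self.c -= 1
        self.a -= 1


def reachable_states(a0, c0, depth):
    """States (a, b, c) reachable by calling at most `depth` methods (hypothetical helper)."""
    init = ExtendedSystem(a0, c0)
    frontier = {(init.a, init.b, init.c)}
    seen = set(frontier)
    for _ in range(depth):
        nxt = set()
        for (a, b, c) in frontier:
            for name in ("add", "sub", "ext"):
                obj = ExtendedSystem.__new__(ExtendedSystem)  # bypass the constructor
                obj.a, obj.b, obj.c = a, b, c
                getattr(obj, name)()
                nxt.add((obj.a, obj.b, obj.c))
        frontier = nxt - seen
        seen |= nxt
    return seen


C0 = 10


def J(a, b, c):          # invariant obtained through (5)
    return a + b <= C0 and c <= C0


def J_prime(a, b, c):    # invariant obtained through (3)
    return a + b == c and c <= C0


states = reachable_states(a0=3, c0=C0, depth=4)
both_hold = all(J(*s) and J_prime(*s) for s in states)
# A state admitted by J but excluded by the more precise J'.
spurious = (0, 0, 5)
```

Every explored state satisfies both properties, while `spurious` is accepted by J only, which is the sense in which J' is more precise.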

Method Redefinition. A subclass may redefine the behavior of the superclass
by redefining some of its methods. The modalities for doing so largely
depend on the object oriented language under consideration. For example C++, C# and
Java apply syntactic criteria (the overriding method must have the same name and
type as the overridden one) whereas in Eiffel [14] a method n overrides m if and only
if the type of n is (co-variantly) a subtype of the type of m. Nevertheless, in order to be as
language-independent as possible, we consider just how overriding and overridden
methods interact and not how the overriding is provided by the language.
However, an implementation of an analyzer for a real language must consider this
point.

As usual in abstract interpretation, we proceed by successive approximations.
First we assume that all the methods of S may be executed, even those
that are redefined, so that the results of the previous section apply directly. This
is an over-approximation of the real behavior: if a method is redefined in a subclass,
then it is not directly accessible by the context. Nevertheless, the redefined
method is still reachable by the method bodies of E. In general, because of late
binding, we must distinguish two situations, downcalls and upcalls: in the first
case a method of the parent class invokes one that has been redefined, and in
the latter the interaction happens in the opposite direction. Once again, we can
handle both by upper-approximating their behavior. Let us consider a method
definition in the form $m = \lambda x_{in}.\ (\ldots;\ v := m_{call}(y);\ \ldots)$, for some variables v
and y.

If the invocation of $m_{call}$ may resolve in a downcall, then a safe (but rather
imprecise) approximation is to consider that an overriding method can arbitrarily
modify the object internal state. Then the object state at the program point
just after the assignment can be any $\sigma \in \Sigma$. Therefore when performing the
analysis of the body of m, the abstract environment just after the assignment will
be set equal to $\alpha(\Sigma) = \bar{\top}$, the largest element of the abstract domain. An
improvement would be to use I instead of $\alpha(\Sigma)$, but then the subclass must
be checked to preserve the invariant I, i.e. $J|_{F_C} \;\bar{\sqsubseteq}\; I$. A better solution for
downcalls in the general case is the subject of future work.

On the other hand, if $m_{call}$ resolves in an upcall then a worst-case approximation
of $\bar{\mathbb{M}}[\![m_{call}]\!]$ must be employed, so that $v := \bar{\mathbb{M}}[\![m_{call}]\!](\bar{\top})$. Why?
Essentially because in the body of m the (super)class invariant may not hold [14], so
that it is not sound to assume it as an approximation of the semantics of $m_{call}$.
Therefore, if we want to analyze the subclass code without referring to the parent one,
then the only sound assumption is to consider the postcondition of $m_{call}$ when its
input is not known, i.e. it can be anything. In this last case, the base class
must be shipped not only with the invariant I but also with an approximation
$\bar{\mathbb{M}}[\![m_i]\!](\bar{\top})$ for each method $m_i$.

Example 3. A class invariant for StackWithUndo can be inferred by applying the
considerations above and instantiating the equation system (5). It is easy to see
that the constructor preserves the Stack invariant, so $J_0 = I \;\bar{\sqcup}\; \{\mathtt{undoType} = 0\}$.

Then we can consider the application of (4) to undo and to the overridden
methods push and pop. All three perform an upcall. In that case it is sound
to replace it with the Stack invariant, as it holds at the program point just
before the super.pop and super.push invocations. Therefore it is immediate to
obtain $X = I \;\bar{\sqcup}\; \{-1 \le \mathtt{undoType} \le 1\} = I \;\bar{\sqcup}\; X = J$, the class invariant for
StackWithUndo.

If needed, the obtained invariant and method post-conditions can be improved
by using the incremental refinement technique of [10]. The abstract domain
can be refined by partitioning it through the values assumed by undoType
(which, using the already computed class invariant J, are at most 3). In this
way it is possible to infer the properties undoType = 1 ⇒ pos > 0,
undoType = 0 ⇒ 0 ≤ pos ≤ size, and undoType = −1 ⇒ pos < size, so that it
is proved that undo never throws the exception StackError. □

6 Conclusions and Future Work 

In this work we presented a framework for class modular analysis of object
oriented languages in the presence of inheritance. We derived the equations for the
class invariant and we instantiated them in the case of inheritance. We discussed
their solvability when a solution that is a function of the base class invariant is
required. Finally, we showed how to handle upcalls and downcalls.

We have a prototype implementation of the framework presented in this paper
with partial support for inheritance. In particular, the fixpoint computation
is implemented by an OCaml functor whose parameter is an abstract domain
for the approximation of the environments (cf. Theorem 1). From our preliminary
tests it seems that relational domains are best suited for effective and precise
analyses. In fact, such domains make it possible to keep a relation between the
constructor parameters and the object internal state. For instance, in the Stack example
the relation between the Stack constructor parameter, i.e. the stack size, and the
stack pointer is inferred. On the other hand, if the analyzer is instantiated with
the interval domain [5] the inferred class invariant is {pos ∈ [0, +∞]}, which is
too rough to be practical. We plan to extend our prototype implementation in
order to study its effectiveness in practice, and in particular the handling of
inheritance.

Abstract. The method of Invisible Invariants was developed originally in order
to verify safety properties of parameterized systems fully automatically. Roughly
speaking, the method is based on a small model property that implies it is sufficient
to prove some properties on small instantiations of the system, and on a heuristic
that generates candidate invariants. Liveness properties usually require well founded
ranking, and do not fall within the scope of the small model theorem. In this
paper we develop novel proof rules for liveness properties, all of whose proof obligations
are of the correct form to be handled by the small model theorem. We then
develop abstraction and generalization techniques that allow for fully automatic
verification of liveness properties of parameterized systems. We demonstrate the
application of the method on several examples.



1 Introduction 

Uniform verification of parameterized systems is one of the most challenging problems
in verification today. Given a parameterized system $S(N) : P[1] \parallel \cdots \parallel P[N]$ and a
property p, uniform verification attempts to verify $S(N) \models p$ for every N > 1. One
of the most powerful approaches to verification which is not restricted to finite-state
systems is deductive verification. This approach is based on a set of proof rules in which
the user has to establish the validity of a list of premises in order to validate a given
property of the system. The two tasks that the user has to perform are:

1. Identify some auxiliary constructs which appear in the premises of the rule.

2. Establish the logical validity of the premises, using the auxiliary constructs identified
in step 1.

When performing manual deductive verification, the first task is usually the more
difficult, requiring ingenuity, expertise, and a good understanding of the behavior of the
program and the techniques for formalizing these insights. The second task is often
performed using theorem provers such as PVS [22] or STeP [4], which also require extensive
user expertise and ingenuity. The difficulty in the execution of these two steps is the
main reason why deductive verification is not used more extensively.

A representative case is the verification of invariance properties using the invariance
rule of [18]. In order to prove that assertion r is an invariant of program P, the rule
requires coming up with an auxiliary assertion $\varphi$ which is inductive (i.e. is implied by the
initial condition and is preserved under every computation step) and which strengthens
(implies) r.

* This research was supported in part by the Minerva Center for Verification of Reactive Systems,
the European Community IST project “Advance”, the Israel Science Foundation (grant no.
106/02-1), and NSF grant CCR-0205571.

In [20,2] we introduced the method of invisible invariants, which proposes a method
for automatic generation of the auxiliary assertion $\varphi$ for parameterized systems, as well
as an efficient algorithm for checking the validity of the premises of the invariance
rule. In this paper we generalize the method of invisible invariants to deal with liveness
properties.

The method of invisible invariants is based on two main ideas:

It is often the case that the auxiliary assertion for a parameterized system has the form
$\varphi : \forall i.\, q(i)$ (or, more generally, $\forall i \neq j.\, q(i, j)$). We construct an instance of the
parameterized system taking a fixed value $N_0$ for the parameter N. For the finite-state
system $S(N_0)$, we compute the set of reachable states reach using a symbolic model checker.
Let $r_1$ be the projection of reach on process index 1, obtained by discarding references to
all variables which are local to all processes other than P[1]. We take q(i) to be the
generalization of $r_1$ obtained by replacing each reference to a local variable P[1].x by
a reference to P[i].x. The obtained q(i) is our candidate for the body of the inductive
assertion $\varphi : \forall i.\, q(i)$. We refer to this part of the process as project-and-generalize.

Having obtained a candidate for the inductive assertion $\varphi$, we still have to check the
validity of the premises of the invariance rule. Under the assumption that our assertional
language is restricted to the predicates of equality and inequality between bounded
range integer variables (which is adequate for many of the parameterized systems we
considered), and the fact that our candidate inductive assertions have the form $\forall i.\, q(i)$,
we managed to prove a small model theorem. According to this theorem, there exists
a (small) bound $N_0$ such that the premises of the invariance rule are valid for every
N iff they are valid for all $N \le N_0$. This enables us to use BDD techniques in order
to check the validity of the premises. This theorem is based on the fact that, under the
above assumptions, all premises can be written in the form $\forall i \exists j.\, \psi(i, j)$, where $\psi(i, j)$
is a quantifier-free assertion that may only refer to the global variables and the local
variables of P[i] and P[j].

Being able to validate the premises on $S[N_0]$ has the additional important advantage
that the user does not have to see the automatically generated auxiliary assertion $\varphi$. This
assertion is generated as part of the procedure and is immediately used in order to validate
the premises of the rule. This is advantageous because, being generated by symbolic BDD
techniques, its representation is often extremely unreadable and non-intuitive, and will
usually not contribute to a better understanding of the program or its proof. Because the
user never gets to see the auxiliary invariant, we refer to this method as the method of
invisible invariants.

In this paper, we extend the method of invisible invariants to apply to proofs of the
second most important class of properties, which is the class of response properties. These
are liveness properties which can be specified by the temporal formula $\Box(q \rightarrow \Diamond r)$ (also
written as $q \Rightarrow \Diamond r$) and guarantee that any q-state is eventually followed by an r-state.
To do so, we consider a certain variant of rule WELL [17] which establishes the validity
of response properties under the assumption of justice (weak fairness). As is well known
to users of this and similar rules, such a proof requires the generation of two kinds of
auxiliary constructs: “helpful” assertions which characterize the states at which a certain
transition is helpful in promoting progress towards the goal (r), and ranking functions
which measure the progress towards the goal.

In order to make the project-and-generalize technique applicable to the automatic
generation of the ranking functions, we developed a special variant of rule WELL, to
which we refer as rule DistRank. In this version of the rule, we associate with each
potentially helpful transition $\tau_i$ an individual ranking function $\delta_i : \Sigma \mapsto [0..c]$, mapping
states to integers in a small range [0..c], where c is a fixed small constant, independent of
the parameter N. The global ranking function can be obtained by forming the multi-set
$\{\delta_i\}$. In most of the examples we considered, it was sufficient to take the range of the
individual functions to be {0, 1}, which enables us to view each $\delta_i$ as an assertion, and
generate it automatically using the project-and-generalize technique.

The paper is organized as follows: In Section 2, we present the general computational
model of FDS and the restrictions which enable us to (invisibly) obtain auxiliary constructs.
In this section we also review the small model property which enables us to automatically
validate the premises of the various proof rules.

In Section 3 we introduce the new DistRank proof rule, explain how we automatically
generate ranking and helpful assertions for the parameterized case, and demonstrate
the techniques on the two examples of the token-ring and the bakery algorithms. The
method introduced in this section is adequate for all the cases in which the set of reachable
states can be satisfactorily over-approximated by an assertion of the form $\forall i.\, u(i)$ and
both the helpful assertion $h_i$ and the individual ranking function $\delta_i$ can be represented
by an assertion of the form $\alpha(i)$ which only refers to the local variables of process P[i]
and to the global variables.

Not all examples can be handled by assertions which depend on a single parameter.
In Section 4, we consider cases whose verification requires a characterization of the
reachable states by an assertion of the form $\forall i.\, u(i) \wedge \exists j.\, e(j)$. In a future paper we
will consider helpful assertions $h_i$ which have the form $h_i : \forall j.\, \ldots$ With
these extensions we can handle another version of the bakery algorithm. A variant of
this method has been successfully applied to Szymanski’s N-process mutual exclusion
algorithm.



Related Work. In [21] we introduced the method of “counter-abstraction” to automatically
prove liveness properties of parameterized systems. Counter-abstraction is an
instance of data-abstraction [13] and has proven successful in instances of systems with
trivial (star or clique) topologies and a small state-space for each process. The work
there is similar to the work of the PAX group (see, e.g., [3]) which is based on the method
of predicate abstraction [5]. While there are several differences between the two
approaches, neither is “fully automatic” in the sense that the user has to provide the
system with an abstraction methodology.

In [23, 14] we used the method of network invariants [13] to prove liveness properties
of parameterized systems. While extremely powerful, the main weakness of the method
is that the abstraction is performed manually by the user.

The problem of uniform verification of parameterized systems is, in general, undecidable
[1]. One approach to remedy this situation, pursued, e.g., in [8], is to look for
restricted families of parameterized systems for which the problem becomes decidable.
Many of these approaches fail when applied to asynchronous systems where processes
communicate by shared variables.

Another approach is to look for sound but incomplete methods. Representative works
of this approach include methods based on: explicit induction ([9]), network invariants
that can be viewed as implicit induction ([16]), abstraction and approximation of network
invariants ([6]), and other methods based on abstraction ([10]). Other methods include
those relying on “regular model-checking” (e.g., [12]) that are often especially effective
due to the application of acceleration procedures but are not applicable to all cases,
methods based on symmetry reduction (e.g., [11]), or compositional methods (e.g., [19])
that combine automatic abstraction with finite-instantiation due to symmetry. Some of
these works, of which we have mentioned only a few representatives, require the user
to provide auxiliary constructs and thus do not provide for fully automatic verification
of parameterized systems. Others (such as regular model checking) are applicable to
restricted classes of parameterized systems, as is our method. Almost none of these
methods have been applied to the verification of liveness properties, which we address
in this paper.

Less related to our work is the work in [7] which presents methods for obtaining
ranking functions for sequential programs. Another interesting paper which deals with
program termination is [15]. However, the treatment there is also restricted to sequential
programs and does not address parameterized systems.



2 The Model 

As our basic computational model, we take a fair discrete system (FDS)
$S = \langle V, \Theta, \rho, \mathcal{J}, \mathcal{C} \rangle$, where

• $V$: A set of system variables. A state of the system S provides a type-consistent
interpretation of the system variables V. For a state s and a system variable $v \in V$,
we denote by s[v] the value assigned to v by the state s. Let $\Sigma$ denote the set of all
states over V.

• $\Theta$: The initial condition: An assertion (state formula) characterizing the initial
states.

• $\rho(V, V')$: The transition relation: An assertion, relating the values V of the
variables in state $s \in \Sigma$ to the values V' in an S-successor state $s' \in \Sigma$.

• $\mathcal{J}$: A set of justice (weak fairness) requirements: Each justice requirement is an
assertion; a computation must include infinitely many states satisfying the requirement.

• $\mathcal{C}$: A set of compassion (strong fairness) requirements: Each compassion requirement
is a pair $\langle p, q \rangle$ of state assertions; a computation should include either only
finitely many p-states, or infinitely many q-states.

A computation of an FDS S is an infinite sequence of states $\sigma : s_0, s_1, s_2, \ldots$, satisfying
the requirements:

• Initiality: $s_0$ is initial, i.e., $s_0 \models \Theta$.

• Consecution: For each $\ell = 0, 1, \ldots$, the state $s_{\ell+1}$ is an S-successor of $s_\ell$. That
is, $\langle s_\ell, s_{\ell+1} \rangle \models \rho(V, V')$ where, for each $v \in V$, we interpret v as $s_\ell[v]$ and v' as
$s_{\ell+1}[v]$.




Liveness with Invisible Ranking 






• Justice: for every $J \in \mathcal{J}$, $\sigma$ contains infinitely many occurrences of J-states.

• Compassion: for every $\langle p, q \rangle \in \mathcal{C}$, either $\sigma$ contains only finitely many occurrences
of p-states, or $\sigma$ contains infinitely many occurrences of q-states.

For simplicity, all of our examples will contain no compassion requirements, i.e., $\mathcal{C} = \emptyset$.
Most of the methods can be generalized to deal with systems with a non-empty set of
compassion requirements.
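
The four requirements on computations can be made concrete for ultimately periodic runs. The following is a minimal sketch under stated assumptions: states are tuples, assertions are Python predicates, and an infinite computation is represented as a lasso (a finite prefix followed by a loop repeated forever). The names `FDS`, `is_computation`, and the modulo-3 counter are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[int, ...]                      # hypothetical state encoding
Assertion = Callable[[State], bool]


@dataclass
class FDS:
    theta: Assertion                         # initial condition
    rho: Callable[[State, State], bool]      # transition relation rho(V, V')
    justice: List[Assertion]                 # justice requirements
    compassion: List[Tuple[Assertion, Assertion]]  # compassion pairs (p, q)


def is_computation(fds, prefix, loop):
    """Check the lasso `prefix . loop^omega` against the four requirements."""
    assert loop, "the repeated part must be non-empty"
    run = list(prefix) + list(loop)
    # Initiality: the first state satisfies theta.
    if not fds.theta(run[0]):
        return False
    # Consecution: every step, including the one closing the loop, satisfies rho.
    steps = list(zip(run, run[1:])) + [(loop[-1], loop[0])]
    if not all(fds.rho(s, t) for s, t in steps):
        return False
    # Justice: the states visited infinitely often are exactly those on the loop.
    if not all(any(j(s) for s in loop) for j in fds.justice):
        return False
    # Compassion: finitely many p-states (none on the loop) or some q-state on the loop.
    return all((not any(p(s) for s in loop)) or any(q(s) for s in loop)
               for p, q in fds.compassion)


# A counter modulo 3 that may also stutter; justice demands x = 2 infinitely often.
counter = FDS(
    theta=lambda s: s[0] == 0,
    rho=lambda s, t: t[0] in (s[0], (s[0] + 1) % 3),
    justice=[lambda s: s[0] == 2],
    compassion=[],
)
cycling = is_computation(counter, prefix=[], loop=[(0,), (1,), (2,)])
stuck = is_computation(counter, prefix=[], loop=[(0,)])
```

The run that stutters forever at x = 0 satisfies initiality and consecution but is rejected by the justice requirement, whereas the cycling run is accepted.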



2.1 Fair Bounded Discrete Systems 

To allow the application of the invisible constructs methods, we place further restrictions
on the systems we study, leading to the model of fair bounded discrete systems (FBDS),
which is essentially the model of bounded discrete systems of [2] augmented with fairness.
For brevity, we describe here a simplified three-type model; the extension to the general
multi-type case is straightforward.

Let $N \in \mathbb{N}^+$ be the system’s parameter. We allow the following data types:

1. bool: the set of boolean and finite-range scalars;

2. index: a scalar type that includes integers in the range [1..N]. These serve as indices
to arrays and processes;

3. par_data: a scalar data type that includes integers in the range [0..N]; and

4. Arrays of the types index ↦ bool and index ↦ par_data.

For simplicity, we allow the system variables to include any number of variables of types
bool, index, and index ↦ bool, but at most a single variable of type index ↦ par_data,
and no variables of type par_data.

Atomic formulas may compare two variables of the same type. E.g., if y and y'
are index variables, and z is an index ↦ par_data array, then y = y' and z[y] < z[y'] are
both atomic formulas. We admit 0, 1 as constants of type bool, 1, N as constants of
type index, and 0, N as constants of type par_data. Apparent conflicts arising out of
the overloading of the constant symbols can always be resolved by type analysis. We
refer to formulas obtained by boolean combinations of such atomic formulas as restricted
boolean assertions (or boolean assertions for short). Applying quantification over index
variables to boolean assertions, we obtain the class of restricted assertions.

As the initial condition $\Theta$, we allow restricted assertions of the form $\forall i.\, u(i) \wedge \exists j.\, e(j)$,
where u(i) and e(j) are boolean assertions.

As the transition relation $\rho$, we allow restricted assertions of the form
$\exists i \forall j.\, \psi(i, j)$ for a boolean assertion $\psi(i, j)$.

The allowed justice requirements are boolean assertions which may be parameterized 
by an index variable. 

The reason we distinguish between the types index and par_data, even though they
range over similar data domains, is to avoid subscript expressions of the form a[b[i]],
which would invalidate many of the methods developed in the sequel.

Example 1 (The Token Ring Algorithm).

Consider program token-ring in Fig. 1, which is a mutual exclusion algorithm for any
N processes.

    in N : natural where N > 1
    tloc : [1..N]

     N
     ∥  P[i] ::  loop forever do
    i=1            ℓ0 : if tloc = i then tloc := tloc ⊕_N 1
                        goto {ℓ0, ℓ1}
                   ℓ1 : await tloc = i
                   ℓ2 : Critical

Fig. 1. Program token-ring

In this version of the algorithm, the global variable tloc represents the current
location of the token. Location ℓ0 constitutes the non-critical section which may
non-deterministically exit to the trying section at location ℓ1. While in the non-critical
section, a process guarantees to move the token to its right neighbor whenever it receives
it. This is done by incrementing tloc by 1, modulo N. At the trying section, a process
P[i] waits until it receives the token, which is signaled by the condition tloc = i.

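Following is the FBDS corresponding to program token-ring:

$$
\begin{array}{rl}
V: & \left\{ \begin{array}{l} \mathit{tloc} : [1..N] \\ \pi : \mathbf{array}\,[1..N]\ \mathbf{of}\ [0..2] \end{array} \right. \\[4pt]
\Theta: & \forall i.\ \pi[i] = 0 \\[4pt]
\rho: & \exists i : \forall j \neq i.\ (\pi'[j] = \pi[j])\ \wedge \\
 & \left( \begin{array}{l}
      \pi[i] = 0 \wedge \mathit{tloc} = i \wedge \mathit{tloc}' = i \oplus_N 1 \wedge \pi'[i] \in \{0, 1\} \\
      \vee\ \mathit{tloc}' = \mathit{tloc} \wedge \left( \begin{array}{l}
            \pi'[i] = \pi[i] \\
            \vee\ \pi[i] = 0 \wedge \mathit{tloc} \neq i \wedge \pi'[i] = 1 \\
            \vee\ \pi[i] = 1 \wedge \mathit{tloc} = i \wedge \pi'[i] = 2 \\
            \vee\ \pi[i] = 2 \wedge \pi'[i] = 0
          \end{array} \right)
    \end{array} \right) \\[4pt]
\mathcal{J}: & \{ J_0[i] : \neg(\pi[i] = 0 \wedge \mathit{tloc} = i) \mid i \in [1..N] \}\ \cup \\
 & \{ J_1[i] : \neg(\pi[i] = 1 \wedge \mathit{tloc} = i) \mid i \in [1..N] \}\ \cup \\
 & \{ J_2[i] : \neg(\pi[i] = 2) \mid i \in [1..N] \}
\end{array}
$$

Note that tloc is a variable of type index, while the program counter $\pi$ is a variable of
type index ↦ bool.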
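
To make the transition relation above tangible, and to illustrate the project-and-generalize step described in the introduction, the following sketch enumerates the reachable states of a small token-ring instance explicitly. It is a simplification and not the BDD-based procedure of the paper: the projection onto process 1 keeps only the pair (π[1], tloc = 1), and generalization replaces the index 1 by i. The function names and the choice N0 = 3 are assumptions made for illustration.

```python
from itertools import product


def successors(state, n):
    """rho-successors of state (tloc, pi) in S(n); pi[i-1] is the location of P[i]."""
    tloc, pi = state
    result = {state}                                  # idle step
    for i in range(1, n + 1):
        def step(new_loc, new_tloc=tloc):
            result.add((new_tloc, pi[:i - 1] + (new_loc,) + pi[i:]))
        loc = pi[i - 1]
        if loc == 0 and tloc == i:                    # pass the token to the right
            right = i % n + 1                         # i (+)_N 1
            step(0, right)
            step(1, right)
        elif loc == 0:                                # tloc != i: may start trying
            step(1)
        elif loc == 1 and tloc == i:                  # token received: enter critical
            step(2)
        elif loc == 2:                                # leave the critical section
            step(0)
    return result


def reachable(n):
    init = {(t, (0,) * n) for t in range(1, n + 1)}   # Theta: every process at l0
    seen, frontier = set(init), set(init)
    while frontier:
        frontier = {t for s in frontier for t in successors(s, n)} - seen
        seen |= frontier
    return seen


def project(reach):
    """r1: location of P[1] and whether P[1] currently holds the token."""
    return {(pi[0], tloc == 1) for tloc, pi in reach}


def phi(state, r1, n):
    """Candidate assertion: for all i, q(i), where q(i) is r1 with index 1 replaced by i."""
    tloc, pi = state
    return all((pi[i - 1], tloc == i) in r1 for i in range(1, n + 1))


def is_inductive(r1, n):
    all_states = [(t, pi) for t in range(1, n + 1)
                  for pi in product(range(3), repeat=n)]
    init_ok = all(phi((t, (0,) * n), r1, n) for t in range(1, n + 1))
    step_ok = all(phi(t, r1, n)
                  for s in all_states if phi(s, r1, n)
                  for t in successors(s, n))
    return init_ok and step_ok


N0 = 3                                                # arbitrary small instance
r1 = project(reachable(N0))
mutex = all(sum(loc == 2 for loc in pi) <= 1 for _, pi in reachable(4))
```

In this sketch the only pair missing from `r1` is (2, False), so the generalized candidate reads "π[i] = 2 implies tloc = i"; `is_inductive(r1, 4)` checks by brute force that this candidate is inductive on the instance with four processes.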

Strictly speaking, the transition relation as presented above does not fully conform to
the definition of a boolean assertion since it contains the atomic formula $\mathit{tloc}' = i \oplus_N 1$.
However, this can be rectified by a two-stage reduction. First, we replace $\mathit{tloc}' = i \oplus_N 1$
by $(i < N \wedge \mathit{tloc}' = i + 1) \vee (i = N \wedge \mathit{tloc}' = 1)$. Then, we replace the formula
$\exists i : \forall j \neq i : (\ldots \mathit{tloc}' = i + 1 \ldots)$ by
$\exists i, i_1 : \forall j \neq i, j_1 : (j_1 \le i \vee i_1 \le j_1) \wedge (\ldots \mathit{tloc}' = i_1 \ldots)$, which guarantees that
$i_1 = i + 1$.
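
The two stages of this reduction can be sanity-checked by brute force over small index ranges. The sketch below is illustrative only; the function names are hypothetical, and the ordering constraint i < i1 is stated explicitly in the code as an assumption of the sketch, alongside the "no index strictly between i and i1" condition from the text.

```python
def first_stage(i, tloc_next, n):
    """(i < N and tloc' = i + 1) or (i = N and tloc' = 1)."""
    return (i < n and tloc_next == i + 1) or (i == n and tloc_next == 1)


def is_successor_encoding(i, i1, n):
    """i < i1 (assumed here) and every j1 in [1..n] satisfies j1 <= i or i1 <= j1."""
    no_gap = all(j1 <= i or i1 <= j1 for j1 in range(1, n + 1))
    return i < i1 and no_gap


# Stage 1 agrees with cyclic increment on [1..n].
stage1_ok = all(
    first_stage(i, t, n) == (t == i % n + 1)
    for n in range(2, 7)
    for i in range(1, n + 1)
    for t in range(1, n + 1)
)

# Stage 2 characterizes i1 = i + 1 using only comparisons between index variables.
stage2_ok = all(
    is_successor_encoding(i, i1, n) == (i1 == i + 1)
    for n in range(2, 7)
    for i in range(1, n + 1)
    for i1 in range(1, n + 1)
)
```

Both checks range over instances with 2 to 6 processes, which is enough to exercise the wrap-around case i = N.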



Let $\alpha$ be an assertion over V, and R be an assertion over $V \cup V'$, which can be viewed
as a transition relation. We denote by $\alpha \circ R$ the assertion characterizing all states which
are R-successors of $\alpha$-states. We denote by $\alpha \circ R^*$ the states reachable by an R-path of
length zero or more from an $\alpha$-state.

2.2 The Small Model Theorem

Let $\varphi : \forall i \exists j.\, R(i, j)$ be an AE-formula, where R(i, j) is a boolean assertion which refers
to the state variables of a parameterized FBDS S(N) and to the quantified variables i and
j. Let $N_0$ be the number of universally quantified and free index variables and constants
appearing in R. We always include in the count $N_0$ the index constants 1 and N, even if
they do not appear explicitly in R. The following claim (stated first in [20] and extended
in [2]) provides the basis for the automatic validation of the premises in the proof rules:

Claim (Small model property).

Formula $\varphi$ is valid iff it is valid over all instances S(N) for $N \le N_0$.

Proof Sketch

To prove the claim, we consider the negation of the formula, given by
$\psi : \exists i \forall j.\, \neg R(i, j)$, and show that if $\psi$ is satisfiable, then it is satisfiable in a state
of a system S(N) for some $N \le N_0$. Assume that $\psi$ is satisfiable in state $s_{N_1}$ of an instance
$S(N_1)$. Let $1 = u_1 < u_2 < \cdots < u_k = N_1$ be the sequence of pairwise distinct values of index
variables and constants which appear free or existentially quantified within $\psi$, sorted
in ascending order. Obviously $k \le N_0$. We construct a state $s_k$ in the system S(k) as
follows. Each index variable or constant that was interpreted as $u_i$ in $s_{N_1}$ is reinterpreted
as i in $s_k$. For each array a : index ↦ bool and each i = 1, ..., k, we reinterpret $s_k[a[i]]$
as $s_{N_1}[a[u_i]]$. In case the system contains a (single) array b : index ↦ par_data, we
also perform the following reduction: Let S be the set of all values which appear as the
interpretation of $b[u_1], \ldots, b[u_k]$ in $s_{N_1}$, to which we add the constants 0 and $N = N_1$.
Let $0 = v_0 < v_1 < \cdots < v_m = N_1$ be a sorted list of the values appearing in S.
Obviously, $m \le k$. We reinterpret the values of b[i] in $s_k$ as follows. If
$s_{N_1}[b[u_i]] = v_j < N_1$, we let $s_k[b[i]] = j$. If $s_{N_1}[b[u_i]] = N_1$, we let $s_k[b[i]] = k$.

With the understanding that arrays in the system represent the local variables of
individual processes, the reduction from a state of system $S(N_1)$ to a state of system
S(k) amounts to the preservation of processes $P[u_1], \ldots, P[u_k]$, renaming their indices
to P[1], ..., P[k], and discarding any process P[j] in $s_{N_1}$ such that $j \notin \{u_1, \ldots, u_k\}$.
Discarding these processes preserves the validity of $\psi$ because the reference to these
“other” processes is only through a universal quantification $\forall j$. So narrowing the set
over which we universally quantify can only make the property “truer”. This explains
why the method is restricted to AE-formulas.

It follows that state $s_k$ satisfies $\psi$. □

In case the formula $\varphi$ refers to an expression such as i + 1, we add one more variable which
participates in the re-expression of i + 1.
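
The index-renaming step of the proof sketch can be illustrated on a concrete state. The following is a simplified sketch under stated assumptions: it handles only index variables and index ↦ bool arrays (the par_data reduction is omitted), arrays are represented as dictionaries, and the names `reduce_state` and `only_i_set` are hypothetical.

```python
def reduce_state(index_vars, bool_arrays, n1):
    """Map a state of S(n1) to a state of S(k), keeping only the relevant processes.

    index_vars : dict name -> value in [1..n1]
    bool_arrays: dict name -> dict index -> bool   (indices range over 1..n1)
    """
    # u_1 < ... < u_k: distinct values of the index variables plus the constants 1 and N.
    us = sorted(set(index_vars.values()) | {1, n1})
    k = len(us)
    rename = {u: pos for pos, u in enumerate(us, start=1)}    # u_i is renamed to i
    new_vars = {x: rename[v] for x, v in index_vars.items()}
    new_arrays = {a: {rename[u]: vals[u] for u in us}
                  for a, vals in bool_arrays.items()}
    return k, new_vars, new_arrays


def only_i_set(i, x, n):
    """Body of an EA-formula: for all j, x[i] holds and x[j] is false whenever j != i."""
    return all(x[i] and (j == i or not x[j]) for j in range(1, n + 1))


# A state of S(7) in which process 5 is the only one with x set.
n1 = 7
x = {j: (j == 5) for j in range(1, n1 + 1)}
holds_before = only_i_set(5, x, n1)

k, vars_k, arrays_k = reduce_state({"i": 5}, {"x": x}, n1)
holds_after = only_i_set(vars_k["i"], arrays_k["x"], k)
```

Here the witness i = 5 in S(7) is renamed to i = 2 in S(3), and the universally quantified body still holds because the reduction only removes processes from the range of j.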



3 The Method of Invisible Ranking 

In this section we present a new proof rule for liveness properties of an FBDS that allows,
in some cases, to obtain an automatic verification of liveness properties for systems of
any size. We first describe the new proof rule. We then present methods for the automatic
generation of the auxiliary constructs required by the rule. We illustrate the application
of these methods on the example of token-ring as we go along.

3.1 A Distributed Ranking Proof Rule

In Fig. 2, we present proof rule DistRank (for distributed ranking), for verifying
response properties. This rule assumes that the system to be verified has only justice
requirements (in particular, it has no compassion requirements).

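For a parameterized system with a transition domain $T = T(N)$,
    set of states $\Sigma(N)$,
    justice requirements $\{J_\tau \mid \tau \in T\}$,
    invariant assertion $\varphi$,
    assertions q, r, pend and $\{h_\tau \mid \tau \in T\}$,
    and ranking functions $\{\delta_\tau : \Sigma \rightarrow \{0, 1\} \mid \tau \in T\}$

    D1.  $q \wedge \varphi \rightarrow r \vee \mathit{pend}$
    D2.  $\mathit{pend} \wedge \rho \rightarrow r' \vee \mathit{pend}'$
    D3.  $\mathit{pend} \rightarrow \bigvee_{\tau \in T} h_\tau$
    D4.  $\mathit{pend} \wedge \rho \rightarrow r' \vee \bigwedge_{\tau \in T} \delta_\tau \ge \delta'_\tau$
    For every $\tau \in T$
    D5.  $h_\tau \wedge \rho \rightarrow r' \vee h'_\tau \vee \delta_\tau > \delta'_\tau$
    D6.  $h_\tau \rightarrow \neg J_\tau$
    ------------------------------------------
         $q \Rightarrow \Diamond r$

Fig. 2. The liveness rule DistRank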

The rule is configured to deal directly with parameterized systems. As in other rules
for verifying response properties (e.g., [17]), progress is accomplished by the actions of
helpful transitions in the system. In a parameterized system, the set of transitions has
the structure $T(N) = [0..m] \times N$ for some fixed m. Typically, [0..m] enumerates the
locations within a process. For example, in program token-ring, $T(N) = [0..2] \times N$,
where each transition $\tau_k[i]$ is associated with location $k \in [0..2]$ within process
$i \in [1..N]$. Each transition $\tau_k[i]$ is affiliated with a justice requirement $J_k[i]$, asserting
that transition $\tau_k[i]$ is disabled. By requiring that $J_k[i]$ holds infinitely many times,
we guarantee that $\tau_k[i]$ will be disabled infinitely many times, thus $\tau_k[i]$ will not be
continuously enabled without being taken.

Assertion $\varphi$ is an invariant assertion characterizing all the reachable states. Assertion 
$pend$ characterizes the states which can be reached from a reachable $q$-state by an $r$-free 
path. For each transition $\tau$, assertion $h_\tau$ characterizes the states at which $\tau$ is helpful. 
These are the states $s$ that have a $\rho$-successor $s'$ such that $\tau$ is enabled on $s$ and disabled 
on $s'$, and the transition from $s$ to $s'$ makes progress towards the goal. This progress 
is observed by immediately reaching the goal or by a decrease in the ranking function $\delta_\tau$, 
as stated in premise D5. The ranking functions $\delta_\tau$ are used in order to measure progress 
towards the goal. The disabling of $\tau$ is often due to the taking of this transition, but may 
also be caused by some data condition turning false. We require a decrease in ranking in 
both cases. 

Premise D1 guarantees that any reachable $q$-state satisfies $r$ or $pend$. Premise D2 
guarantees that any successor of a $pend$-state also satisfies $r$ or $pend$. Premise D3 
guarantees that any $pend$-state has at least one transition which is helpful in this state. 
Premise D4 guarantees that ranking never increases on transitions between two 
$pend$-states. Note that, due to D2, every $\rho$-successor of a $pend$-state which has not reached 
the goal is also a $pend$-state. Premise D5 guarantees that taking a step from an 
$h_\tau$-state leads into a state which either already satisfies the goal $r$, or causes the rank $\delta_\tau$ to 
decrease, or is again an $h_\tau$-state. Premise D6 guarantees that all $h_\tau$-states violate $J_\tau$. 
Together, premises D5 and D6 imply that the computation cannot stay in $h_\tau$ forever 
without violating justice w.r.t. $J_\tau$. Therefore, the computation must eventually move to 
a $\neg h_\tau$-state, causing $\delta_\tau$ to decrease. Since there are only finitely many $\delta_\tau$ and until the 
goal is reached they monotonically decrease, we can conclude that eventually an $r$-state 
is reached. 



3.2 Automatic Generation of the Auxiliary Constructs 

We now proceed to show how all the auxiliary constructs necessary for the application 
of rule DistRank can be automatically generated. Note that we have to construct a 
symbolic version of these constructs, so that the rule can be applied to a generic $N$. 

For simplicity, we assume that the response property we wish to establish only refers 
to the local variables of process $P[1]$. (The approach can be easily generalized to the case 
when the response property refers to the local variables of process $P[z]$, for an arbitrary 
$z \in [1..N]$.) A case in point is the verification of the property $at\_\ell_1[1] \Rightarrow \Diamond\, at\_\ell_2[1]$ 
for program token-ring. This property claims that every state in which process $P[1]$ is 
at location $\ell_1$ is eventually followed by a state in which process $P[1]$ is at location $\ell_2$. 
Assume that the parameter domain has the form $[0..m] \times N$. 

Since the special process is $P[1]$, we would expect the constructs to have the symbolic 
forms $\varphi : \varphi^A_1 \wedge \forall i \ne 1.\ \varphi^A_{\ne 1}(i)$ and $pend : pend^A_1 \wedge \forall i \ne 1.\ pend^A_{\ne 1}(i)$. For each $k \in [0..m]$, 
we need to compute $h^A_k[1]$, $\delta^A_k[1]$, and the generic $h^A_k[i]$, $\delta^A_k[i]$, which should be symbolic 
in $i$ and apply for all $i$, $1 < i \le N$. All generic constructs are allowed to refer to the 
global variables and to the variables local to $P[1]$. We will consider each of the auxiliary 
constructs and provide a methodology for its generation which will be illustrated on the 
case of program token-ring. 

The construction uses the instantiation $S(N_0)$ for the cutoff value $N_0$ required in 
Claim 2.2. In our case, this yields $N_0 = 6$ as explained below. We denote by $\Theta^C$ and 
$\rho^C$ the initial condition and transition relation for $S(N_0)$. The construction begins by 
computing the concrete auxiliary constructs for $S(N_0)$, which we denote by $\varphi^C$, $pend^C$, 
$h^C_k[j]$, and $\delta^C_k[j]$. We then use project-and-generalize in order to derive the symbolic 
(abstract) versions of these constructs: $\varphi^A$, $pend^A$, $h^A_k[j]$, and $\delta^A_k[j]$. 

Computing $\varphi^A$: Compute the assertion $reach^C = \Theta^C \circ (\rho^C)^*$ characterizing all states 
reachable by system $S(N_0)$. We take $\varphi^A(i) = reach^C[3 \mapsto i]$, which is obtained by 
projecting $reach^C$ on index 3, and then generalizing 3 to $i$. 

For example, in token-ring(6), $reach^C = \bigwedge_{j=1}^{6} (at\_\ell_{0,1}[j] \vee tloc = j)$. The 
projection of $reach^C$ on $j = 3$ yields $(at\_\ell_{0,1}[3] \vee tloc = 3)$. The generalization of 3 to $i$ 
yields $\varphi^A(i) : at\_\ell_{0,1}[i] \vee tloc = i$. Note that when we generalize, we should generalize 
not only the values of the variables local to $P[3]$ but also the case that a global variable, 
such as $tloc$, has the value 3. The choice of 3 as the generic value is arbitrary. Any other 
value would do as well, but we prefer indices different from 1 and $N$. 
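
The project-and-generalize step can be tried out on an explicit-state model. The following is a minimal sketch, not the authors' BDD-based implementation: the token-ring program itself is not shown in this section, so the transition semantics in `successors` are an assumption (modelled on the shape of the program in Fig. 3), and all function names are hypothetical. It computes the reachable states for N = 6 and projects them on index 3.

```python
from collections import deque

N = 6  # instance size N_0 used in the text

def move(locs, i, new_loc):
    """Return locs with process i (1-based) moved to new_loc."""
    return locs[:i - 1] + (new_loc,) + locs[i:]

def successors(state):
    """ASSUMED token-ring semantics (not taken from this section):
    l0: pass the token on if held, then stay at l0 or go to l1;
    l1: wait for the token, then enter l2;  l2: critical, back to l0."""
    locs, tloc = state
    for i in range(1, N + 1):
        loc = locs[i - 1]
        if loc == 0:
            new_tloc = i % N + 1 if tloc == i else tloc
            yield (move(locs, i, 0), new_tloc)
            yield (move(locs, i, 1), new_tloc)
        elif loc == 1 and tloc == i:
            yield (move(locs, i, 2), tloc)
        elif loc == 2:
            yield (move(locs, i, 0), tloc)

def reachable(init):
    """Breadth-first computation of all states reachable from init."""
    seen, queue = {init}, deque([init])
    while queue:
        for t in successors(queue.popleft()):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

def project(states, idx):
    """Keep only what the states say about process idx:
    its location and whether the global tloc equals idx."""
    return {(locs[idx - 1], tloc == idx) for locs, tloc in states}

REACH = reachable(((0,) * N, 1))
PHI_3 = project(REACH, 3)
```

Under these assumptions, `PHI_3` contains exactly the pairs allowed by `at_l0,1[3] or tloc = 3`, and renaming 3 to a symbolic `i` gives the generalized assertion. Projecting on another interior index such as 4 yields the same set, which is why the choice of the generic index does not matter.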

In this part we computed $\varphi^A(i)$ as the generalization of 3 into $i$ in $reach^C$, which 
can be denoted as $\varphi^A(i) = reach^C[3 \mapsto i]$. In later parts we may need to generalize 
two indices, such as $\alpha^A = \alpha^C[2 \mapsto i, 4 \mapsto j]$, where $\alpha^C$ and $\alpha^A$ are a concrete and an 
abstract version of some assertion $\alpha$. The way we compute such abstractions over the 
state variables $tloc$ and $\pi$ of system token-ring is given by 

$\alpha^A(tloc, \pi) \;=\; i < j \;\wedge\; \exists\, tloc', \pi' : \alpha^C(tloc', \pi') \wedge map(2, i, 4, j)$, 

where 

$map(2, i, 4, j) \;=\; \left( \begin{array}{l} \pi[i] = \pi'[2] \;\wedge\; \pi[j] = \pi'[4] \;\wedge \\ (tloc = i \leftrightarrow tloc' = 2) \;\wedge\; (tloc = j \leftrightarrow tloc' = 4) \;\wedge \\ (tloc < i \leftrightarrow tloc' < 2) \;\wedge\; (tloc < j \leftrightarrow tloc' < 4) \end{array} \right)$ 

Note that this computation is very similar to the symbolic computation of the predecessor 
of an assertion, where $map(2, i, 4, j)$ serves as a transition relation. Indeed, we use the 
same module used by a symbolic model checker for carrying out this computation. 

Computing $pend^A$: Compute the assertion $pend^C = (reach^C \wedge q \wedge \neg r) \circ (\rho^C \wedge \neg r)^*$, 
characterizing all states which can be reached from a reachable $(q \wedge \neg r)$-state by an $r$-free 
path. Then we take $pend^A_1 = pend^C[1 \mapsto 1]$, and $pend^A_{\ne 1}(i) = pend^C[1 \mapsto 1, 3 \mapsto i]$. 

Thus, for token-ring(6), $pend^C = reach^C \wedge at\_\ell_1[1]$. We therefore take $pend^A_1 : at\_\ell_1[1]$ 
and $pend^A_{\ne 1}(i) : at\_\ell_1[1] \wedge (at\_\ell_{0,1}[i] \vee tloc = i)$. 

Computing $h^A_k[i]$: First, we compute the concrete helpful assertions $h^C_k[j]$. This is 
based on the following analysis. Assume that $set$ is an assertion characterizing a set of 
states, and let $J$ be some justice requirement. We wish to identify the subset of states 
$\phi$ within $set$ for which the transition associated with $J$ is an escape transition. That is, 
any application of this transition to a $\phi$-state takes us out of $set$. Consider the fix-point 
equation: 

$\phi \;=\; set \wedge \neg J \wedge AX(\phi \vee \neg set)$    (1) 

The equation states that every $\phi$-state must satisfy $set \wedge \neg J$, and that every successor of a 
$\phi$-state is either a $\phi$-state or lies outside of $set$. In particular, all $J$-satisfying successors of 
a $\phi$-state do not belong to $set$. By taking the maximal solution of this fix-point equation, 
denoted $\nu\phi\,(set \wedge \neg J \wedge AX(\phi \vee \neg set))$, we compute the subset of states which are 
helpful for $J$ within $set$. 

Following is an algorithm which computes the concrete helpful assertions $\{h^C_k[j]\}$ 
corresponding to the justice requirements $\{J_k[j]\}$ of system $S(N_0)$. For simplicity, we 
will use $\tau \in \mathcal{T}(N_0)$ as a single parameter. 

for each $\tau \in \mathcal{T}(N_0)$ do $h_\tau := 0$ 
$set := pend^C$ 
for all $\tau \in \mathcal{T}(N_0)$ satisfying $\nu\phi\,(set \wedge \neg J_\tau \wedge AX(\phi \vee \neg set)) \ne 0$ do 
    $h_\tau := h_\tau \vee \nu\phi\,(set \wedge \neg J_\tau \wedge AX(\phi \vee \neg set))$ 
    $set := set \wedge \neg h_\tau$ 

The "for all $\tau \in \mathcal{T}(N_0)$" iteration terminates when it is no longer possible to find a 
$\tau \in \mathcal{T}(N_0)$ which satisfies the non-emptiness requirement. The iteration may choose 
the same $\tau$ more than once. When the iteration terminates, $set$ is 0, i.e., for each of the 
states covered under $pend^C$ there exists a helpful justice requirement which causes it to 
progress. 
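
The maximal fix-point used above can be illustrated on an explicit finite graph. This is a minimal sketch under the assumption that states are hashable values and the transition relation is a successor dictionary; the paper's implementation works symbolically on BDDs, and the names `ax` and `helpful` are hypothetical.

```python
def ax(succ, target):
    """CTL operator AX: states all of whose successors lie in target."""
    return {s for s in succ if all(t in target for t in succ[s])}

def helpful(succ, set_, just):
    """Greatest solution of  phi = set & ~J & AX(phi | ~set).

    succ : dict mapping each state to a list of successor states
    set_ : the states currently considered (the assertion `set`)
    just : the states at which the justice requirement J holds
    """
    outside = set(succ) - set_
    phi = set_ - just          # upper bound on any solution
    while True:
        nxt = (set_ - just) & ax(succ, phi | outside)
        if nxt == phi:
            return phi
        phi = nxt

# Tiny example: state 0 can only loop or leave `set`; state 1 steps to the
# J-state 2, which is still inside `set`, so 1 is not helpful.
SUCC = {0: [0, 3], 1: [2], 2: [2], 3: [3]}
H = helpful(SUCC, {0, 1, 2}, {2})
```

Starting the iteration from `set & ~J` is enough because every solution is contained in it, so the sequence decreases monotonically to the maximal solution.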
Having found the concrete $h^C_k[j]$, we proceed to project-and-generalize as follows: 
for each $k \in [0..m]$ we let $h^A_k[1] = h^C_k[1][1 \mapsto 1]$ and $h^A_k[i] = h^C_k[3][1 \mapsto 1, 3 \mapsto i]$. 

Applying this procedure to token-ring(6), we obtain the following symbolic helpful 
assertions: 

           $h^A_k[1]$                          $h^A_k[i]$, $i > 1$ 
$k = 0$    $0$                                 $at\_\ell_1[1] \wedge at\_\ell_0[i] \wedge tloc = i$ 
$k = 1$    $at\_\ell_1[1] \wedge tloc = 1$     $at\_\ell_1[1] \wedge at\_\ell_1[i] \wedge tloc = i$ 
$k = 2$    $0$                                 $at\_\ell_1[1] \wedge at\_\ell_2[i] \wedge tloc = i$ 


Computing $\delta^A_k[i]$: As before, we begin by computing the concrete ranking functions 
$\delta^C_k[j]$. We observe that $\delta^C_k[j]$ should equal 1 on every state for which $\tau_k[j]$ is helpful 
and should decrease from 1 to 0 on any transition that causes a helpful $\tau_k[j]$ to become 
unhelpful. Furthermore, $\delta^C_k[j]$ can never increase. It follows that $\delta^C_k[j]$ should equal 
1 on every pending state from which there exists a pending path to a pending state 
satisfying $h^C_k[j]$. Thus, we compute $\delta^C_k[j] = pend^C \wedge ((\neg r)\ EU\ h^C_k[j])$, where $EU$ is 
the "existential-until" CTL operator. This formula identifies all states from which there 
exists an $r$-free path to an $(h^C_k[j])$-state. 
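
The existential-until computation can be sketched on an explicit graph as a least fix-point. This is an illustration only, assuming a successor-dictionary representation; the names `eu` and `ranking_one_states` are hypothetical and not taken from the paper.

```python
def eu(succ, hold, goal):
    """E[hold U goal]: states with a path that stays in `hold`
    until it reaches a `goal` state (least fix-point)."""
    result = set(goal)
    changed = True
    while changed:
        changed = False
        for s in succ:
            if s not in result and s in hold and any(t in result for t in succ[s]):
                result.add(s)
                changed = True
    return result

def ranking_one_states(succ, pend, r_states, h):
    """States on which the 0/1 ranking function equals 1:
    pend & ((not r) EU h)."""
    not_r = set(succ) - r_states
    return pend & eu(succ, not_r, h)

# Chain 0 -> 1 -> 2 -> 3, goal r = {3}, helpful states h = {1}.
SUCC = {0: [1], 1: [2], 2: [3], 3: [3]}
ONE = ranking_one_states(SUCC, {0, 1, 2}, {3}, {1})
```

In this chain the rank is 1 on states 0 and 1 and drops to 0 once the helpful state 1 has been left, which matches the requirement that the rank decreases exactly when a helpful transition stops being helpful.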

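Having found the concrete $\delta^C_k[j]$, we proceed to project-and-generalize as follows: 
for each $k \in [0..m]$, we let $\delta^A_k[1] = \delta^C_k[1][1 \mapsto 1]$ and $\delta^A_k[i] = \delta^C_k[3][1 \mapsto 1, 3 \mapsto i]$. 

Applying this procedure to token-ring(6), we obtain the following abstract ranking 
functions: 

$\delta^A_0[1] = \delta^A_2[1] : 0$ 
$\delta^A_1[1] : at\_\ell_1[1]$ 

and, for $i > 1$, 

$\delta^A_0[i] : at\_\ell_1[1] \wedge (1 < tloc < i \wedge at\_\ell_{0,1}[i] \;\vee\; tloc = i)$ 
$\delta^A_1[i] : at\_\ell_1[1] \wedge (1 < tloc < i \wedge at\_\ell_{0,1}[i] \;\vee\; tloc = i \wedge at\_\ell_1[i])$ 
$\delta^A_2[i] : at\_\ell_1[1] \wedge (1 < tloc < i \wedge at\_\ell_{0,1}[i] \;\vee\; tloc = i \wedge at\_\ell_{1,2}[i])$ 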



3.3 Validating the Premises 



Having computed internally the auxiliary constructs, and checked the invariance of $\varphi$, 
it only remains to check that the six premises of rule DistRank are all valid for any 
value of $N$. Here we use the small model theorem stated in Claim 2.2, which allows us 
to check their validity for all values of $N \le N_0$ for the cutoff value $N_0$ which is 
specified in the theorem. First, we have to ascertain that all premises have the required 
AE form. For auxiliary constructs of the form we have stipulated in this section, this 
is straightforward. Next, we consider the value of $N_0$ required in each of the premises, 
and take the maximum. Note that once $\varphi$ is known to be inductive, we can freely add it 
to the left-hand side of each premise, which we do for the case of Premises D5 and D6 
that, unlike the others, do not include any inductive component. 

Usually, the most complicated premise is D2 and this is the one which determines the 
value of $N_0$. For program token-ring, this premise has the form (where we renamed 
the quantified variables to remove any naming conflicts): 

$\big((\forall a.\ pend(a)) \wedge (\exists i, i_1\ \forall j, j_1.\ \psi(i, i_1, j, j_1))\big) \;\to\; r' \vee \forall c.\ pend'(c)$, 

which is logically equivalent to 

$\forall i, i_1, c\ \exists a, j, j_1.\ \big(pend(a) \wedge \psi(i, i_1, j, j_1) \;\to\; r' \vee pend'(c)\big)$ 

The index variables which are universally quantified or appear free in this formula are 
$\{i, i_1, c, tloc, 1, N\}$, whose count is 6. It is therefore sufficient to take $N_0 = 6$. Having 
determined the size of $N_0$, it is straightforward to compute the premises of $S(N)$ for all 
$N \le N_0$ and check that they are valid, using BDD symbolic methods. 

The same form of auxiliary constructs can be used in order to automatically verify 
algorithm bakery(N), for every $N$. However, this requires the introduction of an auxiliary 
variable min_id into the system, which is the index of the process which holds the 
ticket with minimal value. In future work we will present an extension of the method 
which enables us to verify bakery with no additional auxiliary variables. 

4 Cases Requiring an Existential Invariant 

In some cases, assertions of the form $\forall i.\ u(i)$ are insufficient for capturing all the relevant 
features of the constructs $\varphi^A$ and $pend^A$, and we need to consider assertions of the form 
$\forall i.\ u(i) \wedge \exists j.\ e(j)$. Consider, for example, program channel-ring, presented in Fig. 3. 



in N : natural where N > 1 
chan : array[1..N] of boolean where chan[i] = (i = 2) 

  N             loop forever do 
  ||  P[i] ::     $\ell_0$ : if chan[i] then (chan[i], chan[i $\oplus_N$ 1]) := (0, 1) 
 i=1                       goto 
                  $\ell_1$ : await chan[i] 
                  $\ell_2$ : Critical 

Fig. 3. Program channel-ring 



In this program the location of the token is identified by the index $i$ such that $chan[i] = 1$. 
Computing the universal invariant according to the previous methods we obtain 
$\varphi^A : \forall i.\ (at\_\ell_{0,1}[i] \vee chan[i])$, which is inductive but insufficient in order to establish the 
existence of a helpful transition for every pending state. 

Using a recent extension to the invariant-generation method, we can now derive 
invariants of the form $\forall i.\ u(i) \wedge \exists j.\ e(j)$. Applying this method to the above example, 
we obtain 

$\varphi^A : \forall i.\ (at\_\ell_{0,1}[i] \vee chan[i]) \wedge \exists j.\ chan[j]$ 

Using this extended form of an invariant for both $\varphi^A$ and $pend^A$, we can complete the 
proof of program channel-ring using the methods of Section 3. 

We provide a sketch of the extension which enables the computation of an invariant 
such as $\forall i.\ u(i) \wedge \exists j.\ e(j)$. As before, we pick a value $N_0$, instantiate $S(N_0)$ and use 
the "invisible invariant" method to derive an inductive assertion $\forall i.\ u(i)$. As part of this 
computation we also compute $reach^C$, which captures all the states reachable in $S(N_0)$. 
Since we believe that there is still an existential invariant as part of $reach^C$, the assertion 
$\bigwedge_i u(i)$ must be a strict over-approximation of $reach^C$. Below, we list the sequence of 
steps we take in order to isolate the assertion $e(j)$. On the right of this sequence of steps, 
we list the results of these computations for the case that $reach^C$ equals precisely the 
conjunction $\bigwedge_i u(i) \wedge \bigvee_j e(j)$. 

Algorithm                                         Results when $reach^C = \bigwedge_i u(i) \wedge \bigvee_j e(j)$ 
$\alpha_1 := \bigwedge_i u(i) \wedge \neg reach^C$      $\alpha_1 = \bigwedge_i u(i) \wedge \bigwedge_j \neg e(j)$ 
$\alpha_2 := \alpha_1[1 \mapsto k]$                      $\alpha_2 = u(k) \wedge \neg e(k)$ 
$\alpha_3 := \neg \alpha_2$                              $\alpha_3 = u(k) \to e(k)$ 

Thus, we compute $\exists k.\ \alpha_3(k)$ as the candidate for an existential invariant. Note that, while 
we did not succeed in precisely isolating $e(k)$, we computed instead the implication 
$u(k) \to e(k)$ and, since $\forall i.\ u(i)$ is invariant and the index range is nonempty, $\exists k.\ e(k)$ 
is invariant iff $\exists k.\ (u(k) \to e(k))$ is. 

This technique of obtaining an existential conjunct to an auxiliary assertion 
can be used for other auxiliary constructs. Applying the method of invisible ranking, 
with the new addition, to program channel-ring and the response property 
A [1]> we obtain, for example, : at_ii[l] A and for z > 1, 

/Z 2 W : 3j < i.chan\j] A at_Li[z] A Thus, Premise D3 becomes: 

$at\_\ell_1[1] \wedge \forall i.\ (at\_\ell_{0,1}[i] \vee chan[i]) \wedge \exists j.\ chan[j] \;\to\; at\_\ell_1[1] \wedge \exists j.\ chan[j]$ 

which is obviously valid. 



5 The Bakery Algorithm 

As the final example of the application of the invisible-ranking method we consider 
program bakery, presented in Fig. 4. This version is a variant of Lamport's original 
Bakery Algorithm that offers a solution of the mutual exclusion problem for any $N$ 
processes. The operation "y := maximal value to y[i] while preserving order of elements" 
is defined by 

$\forall j, k : y'[j] < y'[i] \;\wedge\; (y[j] = 0 \leftrightarrow y'[j] = 0) \;\wedge\; (y[j] < y[k] \leftrightarrow y'[j] < y'[k])$ 

where $i$, $j$, and $k$ are mutually distinct. This assignment is in general non-deterministic. 
Here, $\pi$, the program location array, is of type index $\mapsto$ bool, and $y$, the "ticket" array, 
is of type index $\mapsto$ par_data. 
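
The defining relation of the ticket assignment can be read as a predicate on a pair of ticket arrays. The following is a minimal sketch with 0-based indices; the function name `is_max_ticket_update` is hypothetical. Because the relation only constrains the relative order of tickets, several successor arrays are admissible for the same input, which is the non-determinism mentioned above.

```python
from itertools import permutations

def is_max_ticket_update(y, y_new, i):
    """Check the relation for 'y := maximal value to y[i] while
    preserving order of elements' (indices are 0-based here)."""
    others = [j for j in range(len(y)) if j != i]
    for j in others:
        if not y_new[j] < y_new[i]:              # y[i] becomes the maximum
            return False
        if (y[j] == 0) != (y_new[j] == 0):       # zero tickets stay zero
            return False
    for j, k in permutations(others, 2):
        if (y[j] < y[k]) != (y_new[j] < y_new[k]):  # relative order kept
            return False
    return True
```

For example, from `y = [0, 2, 0]` with `i = 0`, both `[3, 2, 0]` and `[3, 1, 0]` are admissible results, while `[1, 2, 0]` is not because the new ticket of process 0 is not maximal.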

The program contains the auxiliary variable min_id which is expected to hold the 
index of a process whose $y$ value is minimal among all the positive $y$-values. The 
maintaining construct implies that this variable is updated, if necessary, whenever some 
$y$ variables change their values. Already in [20] we pointed out that in some cases, it 
is necessary to add auxiliary variables in order to find inductive assertions with fewer 
indices. This version of bakery illustrates the case that such auxiliary variables may 
also be needed in the case of the invisible ranking method. 

in    N : natural where N > 1 
local y : array [1..N] of [0..N] where y = 0 
      min_id : natural where min_id = 1 

  N             loop forever do 
  ||  P[i] ::     $\ell_0$ : NonCritical 
 i=1              $\ell_1$ : y := maximal value to y[i] while preserving order of elements 
                  $\ell_2$ : await $\forall j : (y[j] = 0 \vee y[j] < y[i])$ 
                  $\ell_3$ : Critical 
                  $\ell_4$ : y[i] := 0 
                maintaining $\forall j : y[j] = 0 \vee 0 < y[min\_id] \le y[j]$ 

Fig. 4. Program bakery 

The property we wish to verify for this parameterized system is $at\_\ell_1[z] \Rightarrow \Diamond\, at\_\ell_3[z]$, 
which implies accessibility for an arbitrary process $P[z]$. 

Having the auxiliary variable min_id as part of the system variables, we can proceed 
with the computation of the auxiliary constructs following the recipes explained in 
Section 3. After some simplifications, we can present the automatically derived constructs 
as follows: 






pend^ : 



{hi[f\)A 



m])A 



Vi : {at_£Q i[i\ o y[i] = 0) A -J minJd = i) A 

( {minJd = i y[f\ > y[i] A y[i] ^ 0 V y[j] = 0) A 

^ ^ ^ ■ I (J/W = y[j\ ^ y[i\ = 0) 

ip^ A at-£i^2[z] 



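$h^A_\ell[j]$: 

$\ell$    For $j = z$                                  For $j \ne z$ 
1         $at\_\ell_1[z]$                              $0$ 
2         $at\_\ell_2[z] \wedge min\_id = z$           $at\_\ell_2[z] \wedge min\_id = j \wedge at\_\ell_2[j]$ 
3         $0$                                          $at\_\ell_2[z] \wedge at\_\ell_3[j]$ 
4         $0$                                          $at\_\ell_2[z] \wedge at\_\ell_4[j]$ 

$\delta^A_\ell[j]$: 

$\ell$    For $j = z$              For $j \ne z$ 
1         $at\_\ell_1[z]$          $0$ 
2         $at\_\ell_{1,2}[z]$      $at\_\ell_1[z] \vee at\_\ell_2[z] \wedge at\_\ell_2[j] \wedge y[z] > y[j]$ 
3         $0$                      $at\_\ell_1[z] \vee at\_\ell_2[z] \wedge at\_\ell_{2,3}[j] \wedge y[z] > y[j]$ 
4         $0$                      $at\_\ell_1[z] \vee at\_\ell_2[z] \wedge at\_\ell_{2..4}[j] \wedge y[z] > y[j]$ 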



Using these derived auxiliary constructs, we can verify the validity of the premises of 
rule DistRank over $S(5)$ and conclude that the property of accessibility holds for every 
value of $N$. 
6 Conclusion and Future Work 

The paper showed how to extend the method of invisible invariants to a fully automatic 
proof of parameterized systems $S(N)$ for any value of $N$. We presented rule DistRank, a 
distributed ranking proof rule that allows an automatic computation of helpful assertions 
and ranking functions, which can also be automatically validated. 

The current method only deals with helpful assertions which depend on a single 
parameter. In future work, we plan to present a rule which enables us to deal with 
helpful assertions of the form $h_i : \forall j$ and extend its applicability to systems 
with strong fairness requirements. 
Abstract. We present an automated method for proving the termination 
of an unnested program loop by synthesizing linear ranking functions. 
The method is complete. Namely, if a linear ranking function exists 
then it will be discovered by our method. The method relies on the fact 
that we can obtain the linear ranking functions of the program loop as 
the solutions of a system of linear inequalities that we derive from the 
program loop. The method is used as a subroutine in a method for proving 
termination and other liveness properties of more general programs 
via transition invariants; see [PR03]. 



1 Introduction 

The verification of termination and other liveness properties of programs is a 
difficult problem. It requires the discovery of invariants and ranking functions 
to prove the termination of program loops. 

We present a complete and efficient method for the synthesis of linear ranking 
functions for unnested program loops whose guards and update statements use 
linear arithmetic expressions. We have implemented the method. Preliminary 
experiments show that the method is efficient not only in theory but also in 
practice. 

Roughly, the method works as follows. Given a program loop for which we 
want to find a linear ranking function, we construct a corresponding system 
of linear inequalities over rationals. As we show, the solutions of this system 
encode the linear ranking functions of the program loop. That is, we can check 
the existence of a linear ranking function by constraint solving. If it exists, a 
linear ranking function can be constructed from a solution of the system of 
linear inequalities, a solution that we obtain by constraint solving. If the system 
has no solutions then (and only then) a linear ranking function does not exist. 
As a consequence of our approach, one can use existing highly-optimized tools 
for linear programming as the engine in a complete method (to our knowledge 
the first) for the synthesis of linear ranking functions. 

We admit unnested program loops with nondeterministic update statements. 
This is potentially useful to model read statements. It is strictly required in the 
context where we employ our method, described next. 

In work described elsewhere [PR03], we show that one can reduce the 
test of termination and other liveness properties (in the presence of fairness 
assumptions) to the test of termination of unnested program loops. That is, 
we use the algorithm described in this paper as a subroutine in the software 
model checking method for liveness properties via transition invariants proposed 
in [PR03]. The experiments that we present in this paper stem from this context. 



2 Unnested Program Loops 

We formalize the notion of unnested program loops by a class of programs that 
are built using a single “while” statement and that satisfy the following condi- 
tions: 

— the loop condition is a conjunction of atomic propositions, 

— the loop body may only contain update statements, 

— all update statements are executed simultaneously. 

We call this class simple while programs. Pseudo-code notation for the programs 
of this class is given below. 

while ($Cond_1$ and ... and $Cond_m$) do 
    Simultaneous Updates 
od 

We consider the subclass of simple while programs built using linear arith- 
metic expressions over program variables. 

Definition 1. A linear arithmetic simple while (LASW) program over the tuple 
of program variables $x = (x_1, \ldots, x_n)$ is a simple while program such that: 

— program variables have integer domain, 

— every atomic proposition in the loop condition is a linear inequality over 
(unprimed) program variables: 

$c_1 x_1 + \cdots + c_n x_n \le c_0$, 

— every update statement is a linear inequality over unprimed and primed 
program variables: 

$a'_1 x'_1 + \cdots + a'_n x'_n \le a_1 x_1 + \cdots + a_n x_n + a_0$. 

Note that we allow the left-hand side of an update statement to be a linear 
expression over program variables, and that an update can be nondeterministic, 
e.g., $x' + y' \le x + 2y - 1$. This is necessary, because we use simple while programs, 
and LASW programs in particular, to approximate the transitive closure of a 
transition relation (see Section 4). 

We define a program state to be a valuation of program variables. The set 
of all program states is called the program domain. The transition relation 
denoted by the loop body of an LASW program is the set of all pairs of program 
states $(s, s')$ such that the state $s$ satisfies the loop condition, and $(s, s')$ satisfies 
each update statement. A trace is a sequence of states such that each pair of 
consecutive states belongs to the transition relation of the loop body. 

We observe that the transition relation of an LASW program can be expressed 
by a system of inequalities over unprimed and primed program variables. The 
translation procedure is straightforward. For the rest of this paper, we assume 
that an LASW program over the tuple of program variables $x = (x_1, \ldots, x_n)$ 
(treated as a column vector) can be represented by the system 

$(A\;A')\binom{x}{x'} \le b$ 

of inequalities. We identify an LASW program with the corresponding system 
of inequalities. 

Example 1. The following program loop with nondeterministic updates 

while ($i - j \ge 1$) do 
    $(i, j) := (i - Nat,\ j + Pos)$ 
od 

is represented by the following system of inequalities. 

$-i + j \le -1$ 
$-i + i' \le 0$ 
$j - j' \le -1$ 

Note that the relations between program variables denoted by the nondeterministic 
update statements $i := i - Nat$ and $j := j + Pos$, where $Nat$ and $Pos$ stand 
for any nonnegative and positive integer number respectively, can be expressed 
by the inequalities $i' \le i$ and $j' \ge j + 1$. 
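
The matrix form of this example can be written down directly. The sketch below encodes the guard and the two update constraints (with $j' \ge j + 1$ rewritten as $j - j' \le -1$) as rows of `A`, `A_PRIME` and `B`, and checks whether a pair of states is a transition. The names are hypothetical and the encoding is one of several equivalent ones.

```python
# Rows: guard  -i + j <= -1;  update  -i + i' <= 0;  update  j - j' <= -1
A       = [[-1, 1], [-1, 0], [0, 1]]    # coefficients of (i, j)
A_PRIME = [[0, 0], [1, 0], [0, -1]]     # coefficients of (i', j')
B       = [-1, 0, -1]

def is_transition(s, s_next):
    """True iff (A A')(s; s_next) <= B holds row by row."""
    return all(
        sum(a * x for a, x in zip(row, s))
        + sum(a * x for a, x in zip(row_p, s_next)) <= b
        for row, row_p, b in zip(A, A_PRIME, B)
    )
```

For instance, `is_transition((3, 1), (2, 2))` holds, since `i` decreases by a natural number and `j` increases by a positive one, while `is_transition((3, 1), (4, 2))` fails because `i` increases.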



3 The Algorithm 



We say that a simple while program is terminating if the program domain is 
well-founded by the transition relation of the loop body of the program, i.e., 
if there is no infinite sequence of program states $s_1, s_2, \ldots$ such that each pair 
$(s_i, s_{i+1})$, where $i \ge 1$, is an element of the transition relation. 

The following theorem allows us to use linear programming over rationals 
to test the existence of a linear ranking function, and thus to test a sufficient 
condition for termination of LASW programs. The corresponding algorithm is 
shown in Figure 1. 

Theorem 1. A linear arithmetic simple while program given by the system 
$(A\;A')\binom{x}{x'} \le b$ is terminating if there exist nonnegative vectors over rationals 
$\lambda_1$ and $\lambda_2$ such that the following system is satisfiable. 

$\lambda_1 A' = 0$              (1a) 
$(\lambda_1 - \lambda_2) A = 0$   (1b) 
$\lambda_2 (A + A') = 0$        (1c) 
$\lambda_2 b < 0$               (1d) 

input 
    program $(A\;A')\binom{x}{x'} \le b$ 
begin 
    if exists rational-valued $\lambda_1$ and $\lambda_2$ such that 
        $\lambda_1, \lambda_2 \ge 0$ 
        $\lambda_1 A' = 0$ 
        $(\lambda_1 - \lambda_2) A = 0$ 
        $\lambda_2 (A + A') = 0$ 
        $\lambda_2 b < 0$ 
    then 
        return("Program Terminates") 
    else 
        return("Linear ranking function does not exist") 
end. 

Given $\lambda_1$ and $\lambda_2$, solutions of the systems above, define 
$r \stackrel{def}{=} \lambda_2 A'$, $\delta_0 \stackrel{def}{=} -\lambda_1 b$, and $\delta \stackrel{def}{=} -\lambda_2 b$. A linear ranking 
function $\rho$ is defined by 

$\rho(x) \stackrel{def}{=} \begin{cases} r x & \text{if exists } x' \text{ such that } (A\;A')\binom{x}{x'} \le b, \\ \delta_0 - \delta & \text{otherwise.} \end{cases}$ 

Fig. 1. Termination Test and Synthesis of Linear Ranking Functions. 



Proof. Let the pair of nonnegative (row) vectors $\lambda_1$ and $\lambda_2$ be a solution of the 
system (1a)-(1d). For every $x$ and $x'$ such that $(A\;A')\binom{x}{x'} \le b$, by the assumption 
that $\lambda_1 \ge 0$, we have $\lambda_1 (A\;A')\binom{x}{x'} \le \lambda_1 b$. We carry out the following sequence of 
transformations. 

$\lambda_1 (A x + A' x') \le \lambda_1 b$ 
$\lambda_1 A x + \lambda_1 A' x' \le \lambda_1 b$ 
$\lambda_1 A x \le \lambda_1 b$            by (1a) 
$\lambda_2 A x \le \lambda_1 b$            by (1b) 
$-\lambda_2 A' x \le \lambda_1 b$          by (1c) 

From the assumption $\lambda_2 \ge 0$ follows $\lambda_2 (A\;A')\binom{x}{x'} \le \lambda_2 b$. Then, we continue with 

$\lambda_2 (A x + A' x') \le \lambda_2 b$ 
$\lambda_2 A x + \lambda_2 A' x' \le \lambda_2 b$ 
$-\lambda_2 A' x + \lambda_2 A' x' \le \lambda_2 b$      by (1c) 

We define $r \stackrel{def}{=} \lambda_2 A'$, $\delta_0 \stackrel{def}{=} -\lambda_1 b$, and $\delta \stackrel{def}{=} -\lambda_2 b$. Then, we have $r x \ge \delta_0$ and 
$r x' \le r x - \delta$ for all $x$ and $x'$ such that $(A\;A')\binom{x}{x'} \le b$. Due to (1d) we have $\delta > 0$. 

We define a function $\rho$ as shown in Figure 1. Any program trace induces a 
strictly descending sequence of values under $\rho$ that is bounded from below, and 
the difference between two consecutive values is at least $\delta$. Since no such infinite 
sequence exists, the program is terminating. □ 

The theorem above states a sufficient condition for termination. We observe 
that if the condition applies then a linear ranking function, i.e., a linear arith- 
metic expression over program variables which maps program states into a well- 
founded domain, exists. The following theorem states that our termination test 
is complete for the programs with linear ranking functions. 

Theorem 2 . If there exists a linear ranking function for the linear arithmetic 
simple while program with nonempty transition relation then the termination 
condition of Theorem 1 applies. 

Proof. Let the vector $r$ together with the constants $\delta_0$ and $\delta > 0$ define a linear 
ranking function. Then, for all pairs $x$ and $x'$ such that $(A\;A')\binom{x}{x'} \le b$ we have 
$r x \ge \delta_0$ and $r x' \le r x - \delta$. 

By the non-emptiness of the transition relation, the system $(A\;A')\binom{x}{x'} \le b$ 
has at least one solution. Hence, we can apply the 'affine' form of Farkas' lemma 
(in [Sch86]), from which follows that there exist $\delta'_0$ and $\delta'$ such that $\delta'_0 \ge \delta_0$, 
$\delta' \ge \delta$, and each of the inequalities $-r x \le -\delta'_0$ and $-r x + r x' \le -\delta'$ is a 
nonnegative linear combination of the inequalities of the system $(A\;A')\binom{x}{x'} \le b$. 
This means that there exist nonnegative rational-valued vectors $\lambda_1$ and $\lambda_2$ such 
that 

$\lambda_1 (A\;A')\binom{x}{x'} = -r x$ 
$\lambda_1 b = -\delta'_0$ 

and 

$\lambda_2 (A\;A')\binom{x}{x'} = -r x + r x'$ 
$\lambda_2 b = -\delta'$. 

After multiplication and simplification we obtain 

$\lambda_1 A = -r \qquad \lambda_1 A' = 0$ 
$\lambda_2 A = -r \qquad \lambda_2 A' = r$, 

from which equations (1a)-(1c) follow directly. Since $\delta' \ge \delta > 0$, we have $\lambda_2 b < 0$, 
i.e., the equation (1d) holds. □ 

The following corollary is an immediate consequence of Theorems 1 and 2. 

Corollary 1. Existence of linear ranking functions for linear arithmetic simple 
while programs with nonempty transition relation is decidable in polynomial time. 

Not every LASW program has a linear ranking function (see the following 
example). 

Example 2. Consider the following program. 

while ($x \ge 0$) do 
    $x := -2x + 10$ 
od 

The program is terminating, but it does not have a linear ranking function. 
For the termination proof consider the following ranking function into the domain 
$\{0, \ldots, 3\}$ well-founded by the less-than relation $<$. 

$\rho(x) \stackrel{def}{=} \begin{cases} 1 & \text{if } x \in \{0, 1, 2\}, \\ 2 & \text{if } x \in \{4, 5\}, \\ 3 & \text{if } x = 3, \\ 0 & \text{otherwise.} \end{cases}$ 

It can be easily tested that the system (1a)-(1d) is not satisfiable for the corresponding 
LASW program. By Theorem 2, this implies that no linear ranking function exists for the 
program above. □ 
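
The behaviour of this loop is easy to inspect by direct simulation. The sketch below assumes the loop guard is $x \ge 0$ (the comparison sign is not fully legible in the source); the helper name `trace` is hypothetical. It records the sequence of values of `x` until the guard fails.

```python
def trace(x, guard=lambda v: v >= 0):
    """Run  while guard(x): x := -2x + 10  and return all visited values.
    The guard x >= 0 is an assumption about the original program."""
    states = [x]
    while guard(x):
        x = -2 * x + 10
        states.append(x)
    return states
```

Starting from 3 the loop visits 3, 4, 2, 6, -2. The value of `x` increases on the step from 3 to 4 and decreases on the step from 4 to 2, so any linear function $r x$ that decreases on the first step needs $r < 0$, while decreasing on the second needs $r > 0$. This gives a direct intuition for why no linear ranking function can exist even though every run stops after a few steps.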



The following example illustrates an application of the algorithm based on The- 
orem 1. 



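Example 3. We prove termination of the LASW program from Example 1. The 
program translates to the system $(A\;A')\binom{x}{x'} \le b$, where: 

$A = \begin{pmatrix} -1 & 1 \\ -1 & 0 \\ 0 & 1 \end{pmatrix}, \quad A' = \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad x = \binom{i}{j}, \quad x' = \binom{i'}{j'}, \quad b = \begin{pmatrix} -1 \\ 0 \\ -1 \end{pmatrix}.$ 

Let $\lambda_1 = (\lambda'_1, \lambda'_2, \lambda'_3)$ and $\lambda_2 = (\lambda''_1, \lambda''_2, \lambda''_3)$. The system (1a)-(1d) is feasible; it 
has the following solutions: 

$\lambda'_2 = \lambda'_3 = \lambda''_1 = 0$, 
$\lambda'_1 = \lambda''_2 = \lambda''_3$, 
$\lambda'_1, \lambda''_2, \lambda''_3 > 0$. 

Since the system is feasible the program is terminating. We construct a linear 
ranking function following the algorithm in Figure 1. We define $r \stackrel{def}{=} \lambda_2 A'$, 
$\delta_0 \stackrel{def}{=} -\lambda_1 b$, and $\delta \stackrel{def}{=} -\lambda_2 b$, and obtain $r = (\lambda''_2\ \ {-\lambda''_3})$, $\delta_0 = \delta = \lambda'_1$. Taking 
$\lambda'_1 = 1$ we obtain the following ranking function. 

$\rho(i, j) \stackrel{def}{=} \begin{cases} i - j & \text{if } i - j \ge 1, \\ 0 & \text{otherwise.} \end{cases}$ 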
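
The solution given in this example can be checked mechanically. The sketch below is a certificate checker rather than the linear-programming search of Figure 1: it takes candidate vectors and verifies conditions (1a)-(1d), then returns the ranking-function data. The matrices are the ones obtained from the three inequalities of Example 1, and the function names are hypothetical.

```python
def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(vi * row[c] for vi, row in zip(v, m)) for c in range(len(m[0]))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def check_certificate(A, A_p, b, lam1, lam2):
    """Return (r, delta0, delta) if lam1, lam2 >= 0 satisfy (1a)-(1d), else None."""
    if any(l < 0 for l in list(lam1) + list(lam2)):
        return None
    A_sum = [[x + y for x, y in zip(ra, rp)] for ra, rp in zip(A, A_p)]
    diff = [l1 - l2 for l1, l2 in zip(lam1, lam2)]
    ok = (all(c == 0 for c in vec_mat(lam1, A_p))        # (1a)
          and all(c == 0 for c in vec_mat(diff, A))      # (1b)
          and all(c == 0 for c in vec_mat(lam2, A_sum))  # (1c)
          and dot(lam2, b) < 0)                          # (1d)
    if not ok:
        return None
    return vec_mat(lam2, A_p), -dot(lam1, b), -dot(lam2, b)

A   = [[-1, 1], [-1, 0], [0, 1]]
A_P = [[0, 0], [1, 0], [0, -1]]
B   = [-1, 0, -1]
RESULT = check_certificate(A, A_P, B, [1, 0, 0], [0, 1, 1])
```

With $\lambda_1 = (1, 0, 0)$ and $\lambda_2 = (0, 1, 1)$ the checker returns $r = (1, -1)$ and $\delta_0 = \delta = 1$, i.e. the function $i - j$ on states that satisfy the guard and $\delta_0 - \delta = 0$ elsewhere. Finding such vectors in general would require an LP solver, which is deliberately left out of this sketch.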

4 Application to General Programs 

In this section we illustrate how our method for proving termination of program 
loops can be used in the software model checking method for liveness properties 
via transition invariants proposed in [PR03]. That method applies to 
general-purpose programs (imperative, concurrent, ...); it is different from other 
approaches to special classes of infinite-state systems, e.g. [BS99]. We then provide 
experimental results obtained by applying the transition invariants approach for 
proving termination of a singular value decomposition program. 

Software model checking for liveness properties is a new approach for the 
automated verification of liveness properties of infinite-state systems by the 
computation of transition invariants. A transition invariant is an over-approximation 
of the transitive closure of the transition relation of the system. The presentation 
of a transition invariant is nothing but a finite set of unnested program 
loops. One can characterize the validity of a liveness property via the existence 
of transition invariants [PR03]. Namely, the liveness property is valid if each of 
the unnested program loops is terminating. 

That is, the general method for the verification of liveness properties 
described in [PR03] is parameterized by an algorithm that tests whether each 
unnested program loop in the transition invariant is terminating, i.e., a 
procedure implementing a termination test for simple while programs. 

Proving termination of simple while programs built using linear arithmetic 
expressions is required for the verification of a large class of software systems, 
e.g., liveness properties for mutual exclusion protocols (bakery, ticket), termi- 
nation proofs of imperative programs (sorting algorithms, numerical algorithms 
dealing with matrices). 

4.1 Sorting Program 

This example illustrates the approach from [PR03] and the role of simple while 
programs. 

We consider the program shown in Figure 2 implementing a sorting algorithm. 
For legibility, we concentrate on the skeleton shown on the right, which 
consists of the statements st1, st2, st3. 

int n, i, j, A[n]; 
i=n; 
l1: while (i>=0) {                  l1: if (i>=0) j=0; 
      j=0; 
l2:   while (j<=i-1) {              l2: if (i-j>=1) { 
        if (A[j]>=A[j+1])                 j=j+1; 
          swap(A[j],A[j+1]);              goto l2; 
        j=j+1;                          } else { 
      }                                   i=i-1; 
      i=i-1;                              goto l1; 
    }                                   } 

Fig. 2. Sorting program and its skeleton. 




11: 


if 


(i>=0) 


{ 


(i.j 


):=(i,0); 


goto 


12; } 


/* 


stl 




12: 


if 


(i-j>=l) 


{ 


(i.j 


) : = (i, j+1) ; 


goto 


12; } 


/* 


st2 




12: 


if 


(i-j<l) 


{ 


(i.j 


):=(i-l,j); 


goto 


11; } 


/* 


st3 


*/ 



We read, for example, the first program statement as: if the current program
location is labeled by l1 and the “if” condition is satisfied, then update the
variables according to the update expressions and change the current label
to l2. Note that the updates are performed simultaneously (“concurrent”
assignments in [Dij76]).

Each of the ‘simple’ programs below must be read as a one-line program.

    l1: if (true)   { (i,j):=(Any,Any);     goto l2; }   /* a1 */
    l2: if (true)   { (i,j):=(Any,Any);     goto l1; }   /* a2 */
    l1: if (i>=0)   { (i,j):=(i-Pos,Any);   goto l1; }   /* a3 */
    l2: if (i>=0)   { (i,j):=(i-Pos,Any);   goto l2; }   /* a4 */
    l2: if (i-j>=1) { (i,j):=(i-Nat,j+Pos); goto l2; }   /* a5 */


Note the nondeterministic update expressions: for example, after execution of
i:=Any the value of variable i could be any integer, and the update i:=i-Pos
decrements the value of i by at least one.

We notice that st1 is approximated by a1, st2 by a5 and st3 by a2. This
means that every transition induced by execution of the statement st1 can
also be achieved by executing a single step of a1. In fact, every sequence of
program statements is approximated by one of a1, . . . , a5. We say that the set
{a1, . . . , a5} is a transition invariant in our terminology.

For example, every sequence of program statements that leads from l2 to
l2 is approximated by a4 if it passes through l1, and by a5 otherwise. The
following table assigns to each ‘simple’ program the set of sequences of program
statements that it approximates. All non-assigned sequences are not feasible.



    a1    st1(st2|st3st1)*
    a2    (st2|st3st1)*st3
    a3    st1(st2|st3st1)*st3
    a4    (st2|st3st1)*st3st1(st2|st3st1)*
    a5    st2+
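
The approximation claim above can be spot-checked mechanically. The following is only a bounded sanity check, not a proof and not the authors' tool: it executes the skeleton statements st1, st2, st3 from all start states in a small window and tests that every pair (start state, state reached after one or more statements) is a possible single step of one of a1, ..., a5. The function names, the window `bound` and the step limit `max_steps` are assumptions made for this sketch.

```python
from itertools import product

# Skeleton statements as partial functions on states (location, i, j).
def st1(s):
    loc, i, j = s
    return ("l2", i, 0) if loc == "l1" and i >= 0 else None

def st2(s):
    loc, i, j = s
    return ("l2", i, j + 1) if loc == "l2" and i - j >= 1 else None

def st3(s):
    loc, i, j = s
    return ("l1", i - 1, j) if loc == "l2" and i - j < 1 else None

STATEMENTS = (st1, st2, st3)

def covered(s, t):
    """True if (s, t) is a possible single step of a1, ..., a5."""
    (loc, i, j), (loc2, i2, j2) = s, t
    a1 = loc == "l1" and loc2 == "l2"                       # (Any, Any)
    a2 = loc == "l2" and loc2 == "l1"                       # (Any, Any)
    a3 = loc == "l1" and loc2 == "l1" and i >= 0 and i2 <= i - 1
    a4 = loc == "l2" and loc2 == "l2" and i >= 0 and i2 <= i - 1
    a5 = (loc == "l2" and loc2 == "l2" and i - j >= 1
          and i2 <= i and j2 >= j + 1)                      # (i-Nat, j+Pos)
    return a1 or a2 or a3 or a4 or a5

def reachable(s, max_steps):
    """All states reachable from s by between 1 and max_steps statements."""
    seen, frontier = set(), {s}
    for _ in range(max_steps):
        nxt = set()
        for u in frontier:
            for st in STATEMENTS:
                t = st(u)
                if t is not None:
                    nxt.add(t)
        seen |= nxt
        frontier = nxt
    return seen

def check(bound=4, max_steps=12):
    window = range(-bound, bound + 1)
    starts = product(("l1", "l2"), window, window)
    return all(covered(s, t) for s in starts for t in reachable(s, max_steps))

print(check())
```

A failing check would point at a sequence of statements that none of the five ‘simple’ programs approximates; a passing check only says that no such sequence exists within the explored window.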



According to the formal development in [PR03], the transition invariant
above is ‘strong enough’ to prove termination, which means: each of its
‘simple’ programs, viewed in isolation, is terminating.

Termination is obvious for ‘simple’ programs that do not refer to a loop in the
control flow graph, like the ‘simple’ programs a1 and a2. The ‘simple’ programs
of the form

    ln: if (cond) { updates; goto ln; }

translate to a while loop

    ln: while (cond) { updates; }.

The ‘simple’ programs that translate to a while loop are in fact simple while
programs whose termination proofs we study in this paper.

Next, we describe an application of the transition invariants method with the
termination test in Figure 1 as a subroutine for checking whether a transition
invariant is strong enough. We prove termination of a program implementing a
singular value decomposition algorithm.

4.2 Program with Unbounded Nondeterminism 

We consider the program shown in Figure 3. It has a nondeterministic choice
at the location labeled by l. The value of the variable y is chosen
nondeterministically in the first branch. A termination proof for this program
requires a lexicographic ranking function. The program translates to the
statements st1 and st2:

    l: if (x>=0) { (x,y):=(x-1,Any); goto l; }   /* st1 */
    l: if (y>=0) { (x,y):=(x,y-1);   goto l; }   /* st2 */

The transition invariant computed by our tool consists of the following
‘simple’ programs.

    l: if (x>=0) { (x,y):=(x-Pos,Any); goto l; }   /* a1 */
    l: if (y>=0) { (x,y):=(Any,y-Pos); goto l; }   /* a2 */





248 A. Podelski and A. Rybalchenko 



    int x, y;
l:  if (*) {
      if (x>=0) { x--; read(y); }
    } else {
      if (y>=0) y--;
    }
    goto l;

Fig. 3. Program with unbounded nondeterminism.




Both ‘simple’ programs, viewed in isolation, are terminating. Hence, the program
with unbounded nondeterminism shown in Figure 3 is terminating.
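
As a bounded illustration (again a sanity check under stated assumptions, not the authors' transition invariant generator), the sketch below runs st1 and st2 from a window of start states and checks that every composed transition is a possible step of a1 or a2. The unbounded choice Any is replaced by a small finite range `ANY`, which is an assumption of this sketch; the names `successors`, `covered` and `check` are hypothetical.

```python
ANY = range(-3, 4)  # finite stand-in for the unbounded choice "Any" (assumption)

def successors(state):
    """States reachable by one statement (st1 or st2) from state (x, y)."""
    x, y = state
    out = set()
    if x >= 0:                        # st1: (x,y) := (x-1, Any)
        out |= {(x - 1, v) for v in ANY}
    if y >= 0:                        # st2: (x,y) := (x, y-1)
        out.add((x, y - 1))
    return out

def covered(s, t):
    """True if (s, t) is a possible single step of a1 or a2."""
    (x, y), (x2, y2) = s, t
    a1 = x >= 0 and x2 <= x - 1       # (x,y) := (x-Pos, Any)
    a2 = y >= 0 and y2 <= y - 1       # (x,y) := (Any, y-Pos)
    return a1 or a2

def check(bound=3, max_steps=6):
    window = range(-bound, bound + 1)
    for s in ((x, y) for x in window for y in window):
        frontier = {s}
        for _ in range(max_steps):
            frontier = {t for u in frontier for t in successors(u)}
            if not all(covered(s, t) for t in frontier):
                return False
    return True

print(check())
```

Each of a1 and a2 has an obvious linear ranking function (x and y, respectively), which is why each terminates when viewed in isolation even though the original program needs a lexicographic argument.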

4.3 Singular Value Decomposition Program 

We considered an algorithm for constructing the singular value decomposition
(SVD) of a matrix. SVD is a set of techniques for dealing with sets of equations or
matrices that are either singular or numerically very close to singular [PTVF92].
A matrix $A$ is singular if it does not have a matrix inverse $A^{-1}$ such that
$AA^{-1} = I$, where $I$ is the identity matrix.

Singular value decomposition of the matrix $A$ whose number of rows $m$ is
greater than or equal to its number of columns $n$ is of the form

$$A = UWV^{T},$$

where $U$ is an $m \times n$ column-orthogonal matrix, $W$ is an $n \times n$ diagonal
matrix with positive or zero elements (called singular values), and $V^{T}$ is the
transpose of an $n \times n$ orthogonal matrix $V$. Orthogonality of the matrices
$U$ and $V$ means that their columns are orthogonal, i.e.,

$$U^{T}U = V^{T}V = I.$$

The SVD decomposition always exists, and is unique up to permutation of the
columns of $U$, elements of $W$ and columns of $V$, or taking linear combinations of
any columns of $U$ and $V$ whose corresponding elements of $W$ are exactly equal.

SVD can be used in numerically difficult cases for solving sets of equations,
constructing an orthogonal basis of a vector space, or for matrix
approximation [PTVF92].

We proved termination of a program implementing the SVD algorithm based
on a routine described in [GVL96]. The program was taken from [PTVF92]. It
is written in C and contains 163 lines of code with 42 loops in the control-flow
graph, nested up to 4 levels.

We used our transition invariant generator to compute a transition invariant
for the SVD program. Proving the transition invariant to be strong enough
required testing termination of 219 LASW programs.

We applied our implementation of the algorithm in Figure 1, which was
done in SICStus Prolog [Lab01] using the built-in constraint solver for linear
arithmetic [Hol95]. Proving termination required 800 ms on a 2.6 GHz Xeon
computer running Linux, which is on average 3.6 ms per LASW program.



5 Related Work 

The verification of termination and other liveness properties of programs requires
the discovery of invariants as well as of ranking functions to prove the
termination of program loops. Here, we relate our work not to methods for the
automated discovery of invariants (see e.g. [Kar76,CH78,BBM97]), but to the more
closely related topic of methods for the automated synthesis of ranking functions,
a topic that has received increasing attention in recent years [GCGL02,CS01,
DGG00,Mes96,MN01,GS02,SG91].

As a first general remark, a major difference between our work and all the 
others lies in the fact that we obtain a completeness result. 

A heuristic-based approach for discovery of ranking functions is described
in [DGG00]. It inspects the program source code for ranking function candidates.
This method is restricted to programs where the ranking function is exhibited
already in the source code.

The algorithm in [CS01] extracts a linear ranking function of an unnested
program loop by manipulating polyhedral cones; these represent the transition
relation of the loop and the loop invariant. Their approach depends on the
strength of the invariant generator, which they call as a subroutine to propose
bounded linear arithmetic expressions. The algorithm requires exponential space
in the worst case. A generalization of that algorithm described in [GS02] for 
programs with complex control structures detects linear ranking functions for 
strongly connected components in the control-flow graph of more general pro- 
grams. In both cases the algorithm is restricted to bounded nondeterminism. 
Moreover, it cannot handle loops with non-monotonic decrease, such as in while 
(x>=0) {x=x+l; x=x-l;}. 

The method for discovery of nonnegative linear combinations of bound
argument sizes for proving termination of logic programs in [SG91] relies on
automatically inferred inter-argument constraints. The duality theory of linear
programming is applied to discover combinations that decrease during top-down
execution of recursive rules; the determined combinations are bounded from
below since argument sizes are always positive. This method was applied for
inferring termination of constraint logic programs [Mes96], and in systems for
inferring termination of logic programs [MN01,GGGL02]. Carried over into the
context of imperative program loops, the inference of inter-argument constraints
corresponds to calls to the invariant generator, as in [CS01]; the same restrictions
as mentioned above apply.

6 Conclusion

We have presented what is, to our knowledge, the first complete algorithm for
the synthesis of linear ranking functions for a small but natural and
well-motivated class of programs, namely, unnested program loops built using
linear arithmetic expressions (LASW programs). The method is guaranteed to find a
linear ranking function, and therefore to prove termination, if a linear ranking
function exists. The existence of a linear ranking function for an LASW program is
equivalent to the satisfiability of the system of linear inequalities derived from
the program.

The termination check for LASW programs is a subroutine in the automated
method for the verification of termination and other liveness properties
of general-purpose programs via the computation of transition invariants [PR03].

We have implemented the proposed algorithm using an efficient implementation
of a solver for linear programming over rationals [Hol95]. We applied our
implementation to prove termination of a singular value decomposition program,
which required termination proofs for 219 LASW programs. This and other
experiments indicate the practical potential of the algorithm.

Considering future work, we would like to find a characterization of LASW 
programs that do always have linear ranking functions, i.e., for which our al- 
gorithm decides termination. Another direction of work is to handle unnested 
program loops built using expressions other than linear arithmetic. 

Abstract. This paper shows how to achieve, under certain conditions,
abstract-interpretation algorithms that enjoy the best possible precision for a
given abstraction. The key idea is a simple process of successive approximation
that makes repeated calls to a decision procedure, and obtains the best abstract
value for a set of concrete stores that are represented symbolically, using a
logical formula.



1 Introduction 

Abstract interpretation [6] is a well-established technique for automatically proving cer- 
tain program properties. In abstract interpretation, sets of program stores are represented 
in a conservative manner by abstract values. Each program statement is given an inter- 
pretation over abstract values that is conservative with respect to its interpretation over 
corresponding sets of concrete stores; that is, the result of “executing” a statement must 
be an abstract value that describes a superset of the concrete stores that actually arise. 
This methodology guarantees that the results of abstract interpretation overapproximate 
the sets of concrete stores that actually arise at each point in the program. 

In [7], it is shown that, under certain reasonable conditions, it is possible to give a
specification of the most-precise abstract interpretation for a given abstract domain. For
a Galois connection defined by abstraction function $\alpha$ and concretization function
$\gamma$, the best abstract post operator for transition $\tau$, denoted by
$\mathrm{Post}^{\sharp}[\tau]$, can be expressed in terms of the concrete post operator
for $\tau$, $\mathrm{Post}[\tau]$, as follows:

$$\mathrm{Post}^{\sharp}[\tau] = \alpha \circ \mathrm{Post}[\tau] \circ \gamma. \qquad (1)$$

This defines the limit of precision obtainable using a given abstraction. However, Eqn. (1)
is non-constructive; it does not provide an algorithm for finding or applying
$\mathrm{Post}^{\sharp}[\tau]$.

Graf and Saïdi [11] showed that decision procedures can be used to generate best
abstract transformers for abstract domains that are fixed, finite, Cartesian products of
Boolean values. (The use of such domains is known as predicate abstraction; predicate
abstraction is also used in SLAM [2] and other systems [8,12].) The work presented in this
paper shows how some of the benefits enjoyed by applications that use the
predicate-abstraction approach can also be enjoyed by applications that use abstract
domains other than predicate-abstraction domains. In particular, this paper’s results
apply to arbitrary finite-height abstract domains, not just to Cartesian products of
Booleans. For example, it applies to the abstract domains used for constant propagation
and common-subexpression elimination [14]. When applied to a predicate-abstraction
domain, the method has the same worst-case complexity as the Graf-Saïdi method.

To understand where the difficulties lie, consider how they are addressed in predicate
abstraction. In general, the result of applying $\gamma$ to an abstract value $l$ is an
infinite set of concrete stores; Graf and Saïdi sidestep this difficulty by performing
$\gamma$ symbolically, expressing the result of $\widehat{\gamma}(l)$ as a formula
$\varphi$. They then introduce a function that, in effect, is the composition of $\alpha$
and $\mathrm{Post}[\tau]$: it applies $\mathrm{Post}[\tau]$ to $\varphi$ and maps the
result back to the abstract domain. In other words, Eqn. (1) is recast using two functions
that work at the symbolic level, $\widehat{\gamma}$ and
$\widehat{\alpha\mathrm{Post}}[\tau]$,$^1$ such that
$\widehat{\alpha\mathrm{Post}}[\tau] \circ \widehat{\gamma} = \alpha \circ \mathrm{Post}[\tau] \circ \gamma$.

To provide insight on what opportunities exist as we move from predicate-abstraction
domains to the more general class of finite-height lattices, we first address a simpler
problem than $\widehat{\alpha\mathrm{Post}}[\tau]$, namely,

    How can $\widehat{\alpha}$ be implemented? That is, how can one identify the
    most-precise abstract value of a given abstract domain that overapproximates a set
    of concrete stores that are represented symbolically?

We then employ the basic idea used in $\widehat{\alpha}$ to implement our own version of
$\widehat{\alpha\mathrm{Post}}[\tau]$.

The contributions of the paper can be summarized as follows:

- The paper shows how some of the benefits enjoyed by predicate abstraction can
  be extended to arbitrary finite-height abstract domains. In particular, we describe
  methods for each of the operations needed to carry out abstract interpretation.

- With some logics, the result of applying $\mathrm{Post}[\tau]$ to a given set of
  concrete stores (represented symbolically) can also be expressed symbolically, as a
  formula $\varphi'$. In this case, we can proceed by computing
  $\widehat{\alpha}(\varphi')$. For other logics, however, $\varphi'$ cannot
  be expressed symbolically without passing to a more powerful logic. For instance,

  • If sets of concrete stores are represented with quantifier-free first-order logic, it
    may require quantified first-order logic to express $\mathrm{Post}[\tau]$.

  • If sets of concrete stores are represented with a decidable subset of first-order
    logic, it may require second-order logic to express $\mathrm{Post}[\tau]$.

  In such situations, the procedure that we give to compute
  $\widehat{\alpha\mathrm{Post}}[\tau]$ provides a way to compute the best transformer
  while staying within the original logic.

The remainder of the paper is organized as follows: Sect. 2 motivates the work by
presenting an $\widehat{\alpha}$ procedure for a specific finite-height lattice. Sect. 3
introduces terminology and notation. Sect. 4 presents the general treatment of
$\widehat{\alpha}$ procedures for finite-height lattices. Sect. 5 discusses symbolic
techniques for implementing transfer functions (i.e.,
$\widehat{\alpha\mathrm{Post}}[\tau]$). Sect. 6 makes some additional observations about
the work. Sect. 7 discusses related work.

2 Motivating Examples 

This section presents several examples to motivate the work. The treatment here is at
a semi-formal level; a more formal treatment is given in later sections. (This section
assumes a certain amount of background on abstract interpretation; some readers may
find it helpful to consult Sect. 3 before reading this section.)

The example concerns a simple concrete domain: let Var denote the set of variables
in the program being analyzed; the concrete domain is $2^{Var \to \mathbb{Z}}$.

$^1$ We use the diacritic $\widehat{\ }$ on a symbol to indicate an operation that either
produces or operates on a symbolic representation of a set of concrete stores.

Predicate Abstraction. A predicate-abstraction domain $\mathcal{PA}[B]$ is based on a set
$B$ of predicate names, each of which has an associated defining formula;
$B = \{B_j \triangleq \varphi_j \mid 1 \le j \le k\}$. Each value in $\mathcal{PA}[B]$ is
a set of possibly negated symbols drawn from $B$, where each symbol $B_j$ is either
present in positive or negative form (but not both), or absent entirely. For instance,
with $B = \{B_1 \triangleq \varphi_1, B_2 \triangleq \varphi_2, B_3 \triangleq \varphi_3\}$,
values in $\mathcal{PA}[B]$ include $\{\neg B_1, B_2\}$, $\{B_1, B_2\}$, $\{\neg B_3\}$,
and $\emptyset$.

We will use a predicate-abstraction domain in which there is a Boolean predicate
$B \triangleq (x = c)$ for each $x \in Var$ and each distinct constant $c$ that appears
in the program. For instance, if the program is

    y := 3
    x := 4 * y + 1                                   (2)

the predicate-abstraction domain is based on the predicate set
$\{B_1 \triangleq (y = 1), B_2 \triangleq (y = 3), B_3 \triangleq (y = 4),
B_4 \triangleq (x = 1), B_5 \triangleq (x = 3), B_6 \triangleq (x = 4)\}$.

Note that this domain does not provide an exact representation of the final state that
arises, $[x \mapsto 13, y \mapsto 3]$. The best that can be done is to use the abstract
value $\{\neg B_1, B_2, \neg B_3, \neg B_4, \neg B_5, \neg B_6\}$, which provides limited
information about the value of x.

Our choice of predicate-abstraction domain
$\mathcal{PA}[\{B_1, B_2, B_3, B_4, B_5, B_6\}]$ was made solely for the sake of
simplicity. With a different choice of predicates, we could have retained a greater or
lesser amount of information about the value of x in the state after program (2);
however, there would always be some program that gives rise to a state in which
information is lost.



The $\alpha$ Function for Predicate-Abstraction Domains. One of the virtues of the
predicate-abstraction method is that it provides a procedure to obtain a most-precise
abstract value, given (a specification of) a set of concrete stores as a logical formula
$\psi$ [11]. We will call this procedure $\widehat{\alpha}_{PA}$; it relies on the aid of
a decision procedure, and can be defined as follows:

$$\widehat{\alpha}_{PA}(\psi) = \{B_j \mid \psi \Rightarrow \varphi_j \text{ is valid}\} \cup \{\neg B_j \mid \psi \Rightarrow \neg\varphi_j \text{ is valid}\} \qquad (3)$$

For instance, suppose that $\psi$ is the formula $(y = 3) \wedge (x = 4 * y + 1)$, which
captures the final state of program (2). For
$\widehat{\alpha}_{PA}((y = 3) \wedge (x = 4 * y + 1))$ to produce the answer
$\{\neg B_1, B_2, \neg B_3, \neg B_4, \neg B_5, \neg B_6\}$, the decision procedure must
demonstrate that the following formulas are valid:

    $(y = 3) \wedge (x = 4 * y + 1) \Rightarrow \neg(y = 1)$    $(y = 3) \wedge (x = 4 * y + 1) \Rightarrow \neg(x = 1)$
    $(y = 3) \wedge (x = 4 * y + 1) \Rightarrow (y = 3)$        $(y = 3) \wedge (x = 4 * y + 1) \Rightarrow \neg(x = 3)$
    $(y = 3) \wedge (x = 4 * y + 1) \Rightarrow \neg(y = 4)$    $(y = 3) \wedge (x = 4 * y + 1) \Rightarrow \neg(x = 4)$
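
To make Eqn. (3) concrete, here is a minimal sketch for the six predicates of the example. It is not the Graf-Saïdi implementation: the function `valid` is a hypothetical stand-in for a decision procedure that merely enumerates all stores with x and y in a bounded range (the parameter `bound` is an assumption), which is only adequate for toy formulas such as this one.

```python
from itertools import product

# Predicates B1..B6 of the example, as functions of a store (x, y).
PREDICATES = {
    "B1": lambda x, y: y == 1,
    "B2": lambda x, y: y == 3,
    "B3": lambda x, y: y == 4,
    "B4": lambda x, y: x == 1,
    "B5": lambda x, y: x == 3,
    "B6": lambda x, y: x == 4,
}

def valid(formula, bound=20):
    """Hypothetical stand-in for a decision procedure: checks the formula
    on every store with x, y in [-bound, bound] only."""
    rng = range(-bound, bound + 1)
    return all(formula(x, y) for x, y in product(rng, rng))

def alpha_pa(psi):
    """Eqn. (3): B_j if psi => phi_j is valid; not B_j if psi => not phi_j is valid."""
    result = set()
    for name, phi in PREDICATES.items():
        if valid(lambda x, y: (not psi(x, y)) or phi(x, y)):
            result.add(name)
        if valid(lambda x, y: (not psi(x, y)) or not phi(x, y)):
            result.add("not " + name)
    return result

# psi captures the final state of program (2).
psi = lambda x, y: y == 3 and x == 4 * y + 1
print(sorted(alpha_pa(psi)))
```

Each predicate costs two validity queries, so the number of calls to the decision procedure is linear in the number of predicates.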



Going Beyond Predicate Abstraction. We now show that the ability to implement the
$\alpha$ function of a Galois connection between a concrete and abstract domain is not
limited to predicate-abstraction domains. In particular, we will demonstrate this for the
abstract domain used in the constant-propagation problem:
$(Var \to \mathbb{Z}^{\top})_{\bot}$. The abstract value $\bot$ represents $\emptyset$; an
abstract value such as $[x \mapsto 0, y \mapsto \top, z \mapsto 0]$ represents all concrete
stores in which program variables x and z are both mapped to 0.$^2$

The procedure to implement $\alpha$ for the constant-propagation domain, which we call
$\widehat{\alpha}_{CP}$, is actually an instance of a general procedure for implementing
$\alpha$ functions that applies to a family of Galois connections. It is presented in
Fig. 1; $\widehat{\alpha}_{CP}$ is the instance of this procedure in which the return type
$L$ is $(Var \to \mathbb{Z}^{\top})_{\bot}$, and “structure” in line [5] means “concrete
store”.



[1]   $L$ $\widehat{\alpha}$(formula $\psi$) {
[2]       ans := $\bot$
[3]       $\varphi$ := $\psi$
[4]       while ($\varphi$ is satisfiable) {
[5]           Select a structure $S$ such that $S \models \varphi$
[6]           ans := ans $\sqcup$ $\beta(S)$
[7]           $\varphi$ := $\varphi \wedge \neg\widehat{\gamma}(\mathit{ans})$
[8]       }
[9]       return ans
[10]  }

Fig. 1. An algorithm to obtain, with the aid of a decision procedure, a most-precise abstract value
that overapproximates a set of concrete stores. In Sect. 2, the return type $L$ is
$(Var \to \mathbb{Z}^{\top})_{\bot}$, and “structure” in line [5] means “concrete store”.



As with procedure $\widehat{\alpha}_{PA}$, $\widehat{\alpha}_{CP}$ is permitted to make
calls to a decision procedure (see line [5] of Fig. 1). We make one assumption that goes
beyond what is assumed in predicate abstraction, namely, we assume that the decision
procedure is a satisfiability checker that is capable of returning a satisfying
assignment, or, equivalently, that it is a validity checker that returns a
counterexample. (In the latter case, the counterexample obtained by calling
ProveValid($\neg\varphi$) is a suitable satisfying assignment.)

The other operations used in procedure $\widehat{\alpha}_{CP}$ are $\beta$, $\sqcup$, and
$\widehat{\gamma}$:

- The concrete and abstract domains are related by a Galois connection defined by a
  representation function $\beta$ that maps a concrete store $S \in Var \to \mathbb{Z}$ to
  an abstract value $\beta(S) \in (Var \to \mathbb{Z}^{\top})_{\bot}$. For instance,
  $\beta$ maps the concrete store $[x \mapsto 13, y \mapsto 3]$ to the abstract value
  $[x \mapsto 13, y \mapsto 3]$.

- $\sqcup$ is the join operation in $(Var \to \mathbb{Z}^{\top})_{\bot}$. For instance,

  $[x \mapsto 0, y \mapsto 43, z \mapsto 0] \sqcup [x \mapsto 0, y \mapsto 46, z \mapsto 0] = [x \mapsto 0, y \mapsto \top, z \mapsto 0]$.

- There is an operation $\widehat{\gamma}$ that maps an abstract value $l$ to a formula
  $\widehat{\gamma}(l)$ such that $l$ and $\widehat{\gamma}(l)$ represent the same set of
  concrete stores. For instance, we have

  $\widehat{\gamma}([x \mapsto 0, y \mapsto \top, z \mapsto 0]) = (x = 0) \wedge (z = 0)$.

  The resulting formula contains no term involving y because $y \mapsto \top$ does not
  place any restrictions on the value of y.

  Operation $\widehat{\gamma}$ permits the concretization of an abstract store to be
  represented symbolically, using a logical formula. This allows sets of concrete stores
  to be manipulated symbolically, via operations on formulas.

$^2$ We write abstract values in Courier typeface (e.g.,
$[x \mapsto 0, y \mapsto \top, z \mapsto 0]$), and concrete stores in Roman typeface
(e.g., $[x \mapsto 0, y \mapsto 43, z \mapsto 0]$).

To see how $\widehat{\alpha}_{CP}$ works, consider the program

    z := 0
    x := y * z                                       (4)

and suppose that $\psi$ is the formula $(z = 0) \wedge (x = y * z)$, which captures the
final state of program (4). The following sequence of operations would be performed
during the invocation of $\widehat{\alpha}_{CP}((z = 0) \wedge (x = y * z))$:



Initialization: ans := $\bot$
                $\varphi$ := $(z = 0) \wedge (x = y * z)$

Iteration 1:    $S$ := $[x \mapsto 0, y \mapsto 43, z \mapsto 0]$   // Some satisfying concrete store
                ans := $\bot \sqcup \beta([x \mapsto 0, y \mapsto 43, z \mapsto 0])$
                     = $[x \mapsto 0, y \mapsto 43, z \mapsto 0]$
                $\widehat{\gamma}$(ans) = $(x = 0) \wedge (y = 43) \wedge (z = 0)$
                $\varphi$ := $(z = 0) \wedge (x = y * z) \wedge \neg((x = 0) \wedge (y = 43) \wedge (z = 0))$
                     = $(z = 0) \wedge (x = y * z) \wedge ((x \neq 0) \vee (y \neq 43) \vee (z \neq 0))$
                     = $(z = 0) \wedge (x = y * z) \wedge (y \neq 43)$

Iteration 2:    $S$ := $[x \mapsto 0, y \mapsto 46, z \mapsto 0]$   // Some satisfying concrete store
                ans := $[x \mapsto 0, y \mapsto 43, z \mapsto 0] \sqcup \beta([x \mapsto 0, y \mapsto 46, z \mapsto 0])$
                     = $[x \mapsto 0, y \mapsto 43, z \mapsto 0] \sqcup [x \mapsto 0, y \mapsto 46, z \mapsto 0]$
                     = $[x \mapsto 0, y \mapsto \top, z \mapsto 0]$
                $\widehat{\gamma}$(ans) = $(x = 0) \wedge (z = 0)$
                $\varphi$ := $(z = 0) \wedge (x = y * z) \wedge (y \neq 43) \wedge ((x \neq 0) \vee (z \neq 0))$
                     = ff

Iteration 3:    $\varphi$ is unsatisfiable

Return value:   $[x \mapsto 0, y \mapsto \top, z \mapsto 0]$



At this point the loop terminates, and $\widehat{\alpha}_{CP}$ returns the abstract value
$[x \mapsto 0, y \mapsto \top, z \mapsto 0]$. In effect, $\widehat{\alpha}_{CP}$ has
automatically discovered that in the abstract world the best treatment of the
multiplication operator is for it to be non-strict in $\top$. That is, 0 is a
multiplicative annihilator that supersedes $\top$: $0 = \top * 0$.
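
The loop of Fig. 1 can be sketched for the constant-propagation domain as follows. This is an illustration under stated assumptions rather than the authors' implementation: formulas are represented as Python predicates over stores, and `find_model` is a hypothetical bounded model finder (it enumerates integer stores in a small range) standing in for a satisfiability checker that returns satisfying assignments. The names `TOP`, `BOTTOM`, `in_gamma` and `alpha_hat` are likewise chosen for this sketch.

```python
from itertools import product

TOP = "T"            # abstract value "not a constant"
BOTTOM = None        # abstract value representing the empty set of stores
VARS = ("x", "y", "z")

def beta(store):
    """Representation function: a concrete store maps to the same bindings."""
    return dict(store)

def join(a, b):
    """Join in the constant-propagation domain."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return {v: a[v] if a[v] == b[v] else TOP for v in VARS}

def in_gamma(store, ans):
    """True if the store satisfies the formula gamma-hat(ans)."""
    if ans is BOTTOM:
        return False
    return all(ans[v] == TOP or ans[v] == store[v] for v in VARS)

def find_model(phi, bound=5):
    """Hypothetical bounded model finder standing in for the decision procedure."""
    rng = range(-bound, bound + 1)
    for values in product(rng, repeat=len(VARS)):
        store = dict(zip(VARS, values))
        if phi(store):
            return store
    return None

def alpha_hat(psi):
    ans = BOTTOM                                   # line [2]
    phi = psi                                      # line [3]
    while (s := find_model(phi)) is not None:      # lines [4] and [5]
        ans = join(ans, beta(s))                   # line [6]
        # line [7]: phi := phi AND NOT gamma-hat(ans)
        phi = lambda st, p=phi, a=ans: p(st) and not in_gamma(st, a)
    return ans                                     # line [9]

# psi captures the final state of program (4).
psi = lambda s: s["z"] == 0 and s["x"] == s["y"] * s["z"]
print(alpha_hat(psi))
```

With this enumeration order the two stores selected differ only in y, so the loop stops after two iterations, mirroring the trace above with different witness values for y.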

In general, $\widehat{\alpha}(\psi)$ carries out a process of successive approximation,
making repeated calls to a decision procedure. Initially, $\varphi$ is set to $\psi$ and
ans is set to $\bot$. On each iteration of the loop in $\widehat{\alpha}$, the value of
ans becomes a better approximation of the desired answer, and the value of $\varphi$
describes a smaller set of concrete stores, namely, those stores described by $\psi$ that
are not, as yet, covered by ans. For instance, at line [7] of Fig. 1 during Iteration
1 of the second example of $\widehat{\alpha}_{CP}(\psi)$, ans has the value
$[x \mapsto 0, y \mapsto 43, z \mapsto 0]$, and the update to $\varphi$,
$\varphi := \varphi \wedge \neg\widehat{\gamma}(\mathit{ans})$, sets $\varphi$ to
$(z = 0) \wedge (x = y * z) \wedge (y \neq 43)$. Thus, $\varphi$ describes exactly the
stores that are described by $\psi$, but are not, as yet, covered by ans.

Each time around the loop, $\widehat{\alpha}$ selects a concrete store $S$ such that
$S \models \varphi$. Then $\widehat{\alpha}$ uses $\beta$ and $\sqcup$ to perform what can
be viewed as a “generalization” operation: $\beta$ converts concrete store $S$ into an
abstract store; the current value of ans is augmented with $\beta(S)$ using $\sqcup$. For
instance, at line [6] of Fig. 1 during Iteration 2 of the second example of
$\widehat{\alpha}_{CP}(\psi)$, ans’s value is changed from
$[x \mapsto 0, y \mapsto 43, z \mapsto 0]$ to
$[x \mapsto 0, y \mapsto 43, z \mapsto 0] \sqcup \beta([x \mapsto 0, y \mapsto 46, z \mapsto 0]) = [x \mapsto 0, y \mapsto \top, z \mapsto 0]$.
In other words, the generalization from two possible values for y, 43 and 46, is $\top$,
which indicates that y may not be a constant at the end of the program.

Fig. 2 presents a sequence of diagrams that illustrate schematically algorithm
$\widehat{\alpha}$ from Fig. 1.

3 Terminology and Notation 

For us, concrete stores are logical structures. The advantage of adopting this outlook is
that it allows potentially infinite sets of concrete stores to be represented using formulas.

Definition 1. Let $P = \{p_1, \ldots, p_m\}$ be a finite set of predicate symbols, each
with a fixed arity; let $P_i$ denote the set of predicate symbols with arity $i$. Let
$C = \{c_1, \ldots, c_n\}$ be a finite set of constant symbols. Let
$F = \{f_1, f_2, \ldots, f_p\}$ be a finite set of function symbols, each with a fixed
arity; let $F_i$ denote the set of function symbols with arity $i$.
A logical structure over vocabulary $V = \langle P, C, F \rangle$ is a tuple
$S = \langle U, \iota_p, \iota_c, \iota_f \rangle$ in which

- $U$ is a (possibly infinite) set of individuals.

- $\iota_p$ is the interpretation of predicate symbols, i.e., for every predicate symbol
  $p \in P_i$, $\iota_p(p) \subseteq U^i$ denotes the set of $i$-tuples for which $p$ holds.

- $\iota_c$ is the interpretation of constant symbols, i.e., for every constant symbol
  $c \in C$, $\iota_c(c) \in U$ denotes the individual associated with $c$.

- $\iota_f$ is the interpretation of function symbols, i.e., for every function symbol
  $f \in F_i$, $\iota_f(f) : U^i \to U$ maps $i$-tuples into an individual.

Typically, some subset of the predicate symbols, constant symbols, and function symbols
have an interpretation that is fixed in advance; this defines a family of intended models.
We denote the (infinite) set of structures over $V$, where the interpretations of
$I \subseteq V$ are fixed in advance, by $ConcreteStruct[V, I]$.

Example 1. In Sect. 2, we considered concrete stores to be members of
$Var \to \mathbb{Z}$. This is a common way to define concrete stores; however, in the
remainder of the paper concrete stores are identified with logical structures. A store in
which program variables are bound to integer values is a logical structure
$\langle \mathbb{Z}, \emptyset, \iota_{Var}, \emptyset \rangle$ over vocabulary
$\langle IntPreds, Var \cup IntConsts, IntFuncs \rangle$, where $\iota_{Var}$ is a mapping
of program variables to integers, and the symbols in
$IntPreds = \{<, \le, =, \ne, \ge, >, \ldots\}$,
$IntConsts = \{0, -1, 1, -2, 2, \ldots\}$, and $IntFuncs = \{+, \ldots\}$ have their
usual meanings. For instance, an example concrete store for a program in which
$Var = \{x, y, z\}$ is

$$\langle \mathbb{Z}, \emptyset, [x \mapsto 0, y \mapsto 2, z \mapsto 0], \emptyset \rangle. \qquad (5)$$

Henceforth, we abbreviate a store such as (5) by
$\iota_c = [x \mapsto 0, y \mapsto 2, z \mapsto 0]$.




258 



T. Reps, M. Sagiv, and G. Yorsh 




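Fig. 2. Schematic diagrams that illustrate the process carried out by algorithm
$\widehat{\alpha}$ from Fig. 1. $\varphi_i$, $S_i$, and $ans_i$ denote the values of
$\varphi$, $S$, and ans during the $i$-th iteration. (a) Initially, $\varphi_1$ is set to
$\psi$ and $ans_1$ is set to $\bot$; $S_1$ is a structure such that
$S_1 \models \varphi_1$. (b) $ans_2$ is set to $ans_1 \sqcup \beta(S_1) = \beta(S_1)$;
$\varphi_2$ is set to $\varphi_1 \wedge \neg\widehat{\gamma}(ans_2)$; $S_2$ is a structure
such that $S_2 \models \varphi_2$. Note that $S_2$ belongs to
$[\![\varphi_2]\!] = [\![\varphi_1 \wedge \neg\widehat{\gamma}(ans_2)]\!]$. (c) $ans_3$ is
set to $ans_2 \sqcup \beta(S_2)$; $\varphi_3$ is set to
$\varphi_2 \wedge \neg\widehat{\gamma}(ans_3)$; $S_3$ is a structure such that
$S_3 \models \varphi_3$. (d) $ans_4$ is set to $ans_3 \sqcup \beta(S_3)$; $\varphi_4$ is
set to $\varphi_3 \wedge \neg\widehat{\gamma}(ans_4)$; $S_4$ is a structure such that
$S_4 \models \varphi_4$. (e) $ans_5$ is set to $ans_4 \sqcup \beta(S_4)$; $\varphi_5$ is
set to $\varphi_4 \wedge \neg\widehat{\gamma}(ans_5)$. In the case portrayed here, the
loop terminates at this point because $\varphi_5 = \mathit{ff}$. The desired answer is
held in $ans_5$. (f) $\widehat{\alpha}(\psi)$ obtains the most-precise abstract value ans
that overapproximates $[\![\psi]\!]$.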



To manipulate sets of structures symbolically, we use formulas of first-order logic
with equality. If $S$ is a logical structure and $\varphi$ is a closed formula, the
notation $S \models \varphi$ means that $S$ satisfies $\varphi$ according to the standard
Tarskian semantics for first-order logic (e.g., see [10]). We use $[\![\varphi]\!]$ to
denote the set of concrete structures that satisfy $\varphi$:
$[\![\varphi]\!] = \{S \mid S \in ConcreteStruct[V, I], S \models \varphi\}$.

Example 2.

$$[\![(x = 0) \wedge (z = 0)]\!] = \left\{ \begin{array}{l} \iota_c = [x \mapsto 0, y \mapsto 0, z \mapsto 0],\ \iota_c = [x \mapsto 0, y \mapsto 1, z \mapsto 0], \\ \iota_c = [x \mapsto 0, y \mapsto 2, z \mapsto 0],\ \ldots \end{array} \right\}$$



Definition 2. A complete join semilattice $L = \langle L, \sqsubseteq, \bigsqcup, \bot \rangle$
is a partially ordered set with partial order $\sqsubseteq$, such that for every subset
$X$ of $L$, $L$ contains a least upper bound (or join), denoted by $\bigsqcup X$.

The minimal element $\bot \in L$ is $\bigsqcup \emptyset$. We use $x \sqcup y$ as a
shorthand for $\bigsqcup \{x, y\}$. We write $x \sqsubset y$ when $x \sqsubseteq y$ and
$x \neq y$.

The powerset of concrete stores, $2^{ConcreteStruct[V, I]}$, is a complete join
semilattice, where (i) $X \sqsubseteq Y$ iff $X \subseteq Y$, (ii) $\bot = \emptyset$,
and (iii) $\bigsqcup = \bigcup$.

Definition 3. Let $L = \langle L, \sqsubseteq, \bigsqcup, \bot \rangle$ be a complete join
semilattice. A strictly increasing chain in $L$ is a sequence of values
$l_1, l_2, \ldots$, such that $l_i \sqsubset l_{i+1}$. We say that $L$ has
finite height if every strictly increasing chain is finite.

We now define an abstract domain by means of a representation function [18].

Definition 4. Given a complete join semilattice
$L = \langle L, \sqsubseteq, \bigsqcup, \bot \rangle$ and a representation function
$\beta : ConcreteStruct[V, I] \to L$ such that for all $S \in ConcreteStruct[V, I]$,
$\beta(S) \neq \bot$, a Galois connection is defined by extending $\beta$
pointwise, i.e., for $XS \subseteq ConcreteStruct[V, I]$ and $l \in L$,

$$\alpha(XS) = \bigsqcup_{S \in XS} \beta(S) \qquad \gamma(l) = \{S \mid S \in ConcreteStruct[V, I],\ \beta(S) \sqsubseteq l\}$$

It is straightforward to show that this defines a Galois connection, i.e., (i) $\alpha$
and $\gamma$ are monotonic, (ii) $\alpha$ distributes over $\cup$,
(iii) $XS \subseteq \gamma(\alpha(XS))$, and (iv) $\alpha(\gamma(l)) \sqsubseteq l$.

We say that $l$ overapproximates a set of concrete stores $XS$ if
$\gamma(l) \supseteq XS$. It is straightforward to show that $\alpha(XS)$ is the
most-precise (i.e., least) abstract value that overapproximates $XS$.



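Example 3. In our examples, the abstract domain will continue to be the one introduced
in Sect. 2, namely, $(Var \to \mathbb{Z}^{\top})_{\bot}$. As we saw in Sect. 2, $\beta$
maps a concrete store like $\iota_c = [x \mapsto 0, y \mapsto 2, z \mapsto 0]$ to an
abstract value $[x \mapsto 0, y \mapsto 2, z \mapsto 0]$. Thus,

$$\begin{array}{rcl}
\alpha\left(\left\{ \begin{array}{l} \iota_c = [x \mapsto 0, y \mapsto 0, z \mapsto 0], \\ \iota_c = [x \mapsto 0, y \mapsto 2, z \mapsto 0] \end{array} \right\}\right)
 & = & \beta(\iota_c = [x \mapsto 0, y \mapsto 0, z \mapsto 0]) \sqcup \beta(\iota_c = [x \mapsto 0, y \mapsto 2, z \mapsto 0]) \\
 & = & [x \mapsto 0, y \mapsto 0, z \mapsto 0] \sqcup [x \mapsto 0, y \mapsto 2, z \mapsto 0] \\
 & = & [x \mapsto 0, y \mapsto \top, z \mapsto 0].
\end{array}$$

Suppose that abstract value $l$ is $[x \mapsto 0, y \mapsto \top, z \mapsto 0]$. Because
$y \mapsto \top$ does not place any restrictions on the value of y, we have

$$\gamma(l) = \left\{ \begin{array}{l} \iota_c = [x \mapsto 0, y \mapsto 0, z \mapsto 0],\ \iota_c = [x \mapsto 0, y \mapsto 1, z \mapsto 0], \\ \iota_c = [x \mapsto 0, y \mapsto 2, z \mapsto 0],\ \ldots \end{array} \right\}$$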

4 Symbolic Implementation of the $\alpha$ Function

This section presents a general framework for implementing $\alpha$ functions of Galois
connections using procedure $\widehat{\alpha}$ from Fig. 1. $\widehat{\alpha}(\psi)$ finds
the most-precise abstract value in a finite-height lattice, given a specification of a set
of concrete stores as a logical formula $\psi$. $\widehat{\alpha}$ represents sets of
concrete stores symbolically, using formulas, and invokes a decision procedure on each
iteration.

The assumptions of the framework are rather minimal; 

- The concrete domain is the power set of ConcreteStruct[V , I]. 

- The concrete and abstract domains are related by a Galois connection defined by 
a representation function (3 that maps a structure S G ConcreteStruct[V , I] to an 
abstract value (3{S). 

- It is possible to take the join of two abstract values. 

- There is an operation 7 that maps an abstract value I to a formula j{l) such that 



[ 7(01 = 7(0- 



(6) 



Operation 7 permits the concretization of an abstract value to be represented symbol- 
ically, using a logical formula, which allows sets of concrete stores to be manipulated 
symbolically, via operations on formulas. (In this paper, we use first-order logic; in 
general, however, other logics could be used.) 

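Example 4. As we saw in Sect. 2, because $y \mapsto \top$ does not place any restrictions on
the value of $y$, we have $\hat{\gamma}([x \mapsto 0, y \mapsto \top, z \mapsto 0]) = (x = 0) \wedge (z = 0)$. From Exs. 2
and 3, we know that

$$[\![(x = 0) \wedge (z = 0)]\!] = \left\{ \begin{array}{l} \iota_c = [x \mapsto 0, y \mapsto 0, z \mapsto 0],\ \iota_c = [x \mapsto 0, y \mapsto 1, z \mapsto 0], \\ \iota_c = [x \mapsto 0, y \mapsto 2, z \mapsto 0],\ \ldots \end{array} \right\}
  = \gamma([x \mapsto 0, y \mapsto \top, z \mapsto 0]),$$

and thus Eqn. (6) is satisfied. For $l \in (Var \to \mathbb{Z}^{\top})_{\bot}$, $\hat{\gamma}(l)$ is defined as follows:

$$\hat{\gamma}(l) = \begin{cases} \mathit{ff} & \text{if } l = \bot \\ \displaystyle\bigwedge_{v \in Var,\ l(v) \neq \top} (v = l(v)) & \text{otherwise} \end{cases}$$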

Specification of Alpha. Procedure $\hat{\alpha}$ is to implement $\alpha$, given a specification of a set
of concrete stores as a logical formula $\psi$. Therefore, $\hat{\alpha}$ must have the property that for
all $\psi$, $\hat{\alpha}(\psi) = \alpha([\![\psi]\!])$.

Note that a logical formula $\psi$ represents the set of concrete stores $[\![\psi]\!]$; thus, $\alpha([\![\psi]\!])$
(and hence $\hat{\alpha}(\psi)$, as well) is the most-precise abstract value that overapproximates the
set of concrete stores represented symbolically by $\psi$.

Implementation of Alpha. Procedure $\hat{\alpha}$ is given in Fig. 1.

Example 5. A trace of a call on $\hat{\alpha}$ for the constant-propagation domain $(Var \to \mathbb{Z}^{\top})_{\bot}$
was presented in Sect. 2. In generalizing the idea from Sect. 2, concrete stores have been
identified with logical structures, so instead of writing, e.g., $S := [x \mapsto 0, y \mapsto 43, z \mapsto 0]$,
we would now write $S := \iota_c = [x \mapsto 0, y \mapsto 43, z \mapsto 0]$.
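The iteration performed by $\hat{\alpha}$ can be sketched in Python under two loudly stated assumptions: Fig. 1 is not reproduced in this section, so the loop below mirrors the shape of the αPost loop of Fig. 3, and the decision procedure is replaced by a brute-force search over a small integer range. Formulas are modeled as Python predicates over stores; `find_model`, `gamma_hat`, `alpha_hat`, and `SEARCH_RANGE` are hypothetical names.

```python
from itertools import product

TOP, BOTTOM = "TOP", None
VARS = ("x", "y", "z")
SEARCH_RANGE = range(-3, 4)  # assumption: bounded search stands in for a decision procedure


def find_model(phi):
    """Toy 'decision procedure': return some store satisfying phi, or None."""
    for values in product(SEARCH_RANGE, repeat=len(VARS)):
        store = dict(zip(VARS, values))
        if phi(store):
            return store
    return None


def join(a, b):
    """Join in the constant-propagation domain."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return {v: a[v] if a[v] == b[v] else TOP for v in VARS}


def gamma_hat(l):
    """Symbolic concretization: a predicate that holds exactly on gamma(l)."""
    if l is BOTTOM:
        return lambda s: False
    return lambda s: all(l[v] == TOP or s[v] == l[v] for v in VARS)


def alpha_hat(psi):
    """Accumulate the join of abstractions of models not yet covered by ans."""
    ans, phi = BOTTOM, psi
    while (model := find_model(phi)) is not None:
        ans = join(ans, model)  # beta(S) is S itself in this domain
        # Strengthen phi with the negation of gamma_hat(ans).
        phi = (lambda s, old=phi, cov=gamma_hat(ans): old(s) and not cov(s))
    return ans


psi = lambda s: s["z"] == 0 and s["x"] == s["y"] * s["z"]
result = alpha_hat(psi)
print(result)
```

With a bounded search the answer is most precise only relative to `SEARCH_RANGE`; a real decision procedure would remove that restriction.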
Theorem 1. Suppose that the abstract domain has finite height of at most $h$. Given input
$\psi$, $\hat{\alpha}(\psi)$ has the following properties:

(i) The loop on lines [4]-[8] in procedure $\hat{\alpha}$ is executed at most $h$ times.

(ii) $\hat{\alpha}(\psi) = \alpha([\![\psi]\!])$ (i.e., $\hat{\alpha}(\psi)$ computes the most-precise abstract value that
overapproximates the set of concrete stores represented symbolically by $\psi$).



5 Symbolic Implementation of Transfer Functions 

5.1 Transfer Functions for Statements 

If $Q$ is a set of predicate, constant, or function symbols, let $Q'$ denote the same set of
symbols, but with a ' attached to each symbol (i.e., $q \in Q$ iff $q' \in Q'$).

The interpretation of statements involves the specification of transition relations
using formulas. Such formulas will be over a "double vocabulary" $V \cup V' =
(P \cup P', C \cup C', F \cup F')$, where unprimed symbols will be referred to as present-state
symbols, and primed symbols as next-state symbols.† The satisfaction relation for a
two-vocabulary formula $\tau$ will be written as $\langle S, S' \rangle \models \tau$, where $S$ and $S'$ are structures
over vocabularies $V = (P, C, F)$ and $V' = (P', C', F')$, respectively; $\langle S, S' \rangle$ is called
a two-vocabulary structure.

Example 6. The formula that expresses the semantics of an assignment x := y * z with
respect to stores over vocabulary $(IntPreds, Var \cup Var' \cup IntConsts, IntFuncs)$,
denoted by $\tau_{x:=y*z}$, can be specified as $\tau_{x:=y*z} = (x' = y * z) \wedge (y' = y) \wedge (z' = z)$.

For parallel form, we will also assume that we have two isomorphic abstract domains,
$L$ and $L'$, and associated variants of $\beta$ and $\hat{\gamma}$:

$$\beta : \mathrm{ConcreteStruct}[V, I] \to L \qquad \beta' : \mathrm{ConcreteStruct}[V', I] \to L'$$
$$\hat{\gamma} : L \to \mathrm{Formula}[V] \qquad \hat{\gamma}' : L' \to \mathrm{Formula}[V']$$

For the constant-propagation domain, this just means that a next-state abstract value
produced by one transition, e.g., $[x' \mapsto 0, y' \mapsto \top, z' \mapsto 0] \in L'$, can be identified as
the present-state abstract value $[x \mapsto 0, y \mapsto \top, z \mapsto 0] \in L$ for the next transition.‡

† For economy of notation, we will not duplicate the symbols $I \in V$ whose interpretation is
fixed in advance.

‡ Alternatively, we could have used a single abstract domain, $L$, and the definitions

$$\beta : \mathrm{ConcreteStruct}[V, I] \to L \qquad \beta' : \mathrm{ConcreteStruct}[V', I] \to L$$
$$\hat{\gamma} : L \to \mathrm{Formula}[V] \qquad \hat{\gamma}' : L \to \mathrm{Formula}[V']$$

The motivation for using two abstract domains is to eliminate a possible source of confusion in
the examples. By using separate abstract domains $L$ and $L'$, primed symbols always distinguish
next-state abstract values from present-state ones.
Specification. Given a formula $\tau$ for a statement's transition relation, the result of
applying $\tau$ to a set of concrete stores $XS$ is

$$\mathrm{Post}[\tau](XS) = \{\, S' \mid \text{exists } S \in XS \text{ such that } \langle S, S' \rangle \models \tau \,\}.$$

(Note that this is a set of structures over vocabulary $V'$.) $\alpha\mathrm{Post}[\tau](l)$ is to return the
most-precise abstract value in $L'$ that overapproximates $\mathrm{Post}[\tau](\gamma(l))$.

Implementation. $\alpha\mathrm{Post}[\tau](l)$ can be computed by the procedure presented in Fig. 3.
After $\varphi$ is initialized to $\hat{\gamma}(l) \wedge \tau$ in line [3], $\alpha\mathrm{Post}[\tau]$ operates very much like $\hat{\alpha}$, except
that only abstractions of the $S'$ structures are accumulated in variable ans' (see lines [5]
and [6]). On each iteration of the loop in $\alpha\mathrm{Post}[\tau]$, the value of ans' becomes a better
approximation of the desired answer, and the value of $\varphi$ describes a smaller set of
concrete stores, namely, those $P \cup P'$ stores that are described by $\hat{\gamma}(l) \wedge \tau$, but whose
range (i.e., projection on the next-state symbols) is not, as yet, covered by ans'.



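[1]  $L'$ $\alpha\mathrm{Post}$(two-vocabulary formula $\tau$ over $P \cup P'$, $L$ $l$) {
[2]    ans' := $\bot'$
[3]    $\varphi := \hat{\gamma}(l) \wedge \tau$
[4]    while ($\varphi$ is satisfiable) {
[5]      Select a two-vocabulary structure $\langle S, S' \rangle$ s.t. $\langle S, S' \rangle \models \varphi$
[6]      ans' := ans' $\sqcup\ \beta'(S')$
[7]      $\varphi := \varphi \wedge \neg\hat{\gamma}'(\text{ans'})$
[8]    }
[9]    return ans'
[10] }

Fig. 3. An algorithm that implements $\alpha\mathrm{Post}[\tau](l)$.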



Example 7. Suppose that $l = [x \mapsto \top, y \mapsto \top, z \mapsto 0]$, and the statement to be
interpreted is x := y * z. Then $\hat{\gamma}(l)$ is the formula $(z = 0)$, and $\tau_{x:=y*z}$ is the formula
$(x' = y * z) \wedge (y' = y) \wedge (z' = z)$. Fig. 4 shows why we have

$$\alpha\mathrm{Post}[\tau_{x:=y*z}]([x \mapsto \top, y \mapsto \top, z \mapsto 0]) = [x' \mapsto 0, y' \mapsto \top, z' \mapsto 0].$$

Theorem 2. Suppose that the abstract domain has finite height of at most $h$. Given
inputs $\tau$ and $l$, $\alpha\mathrm{Post}[\tau](l)$ has the following properties:

(i) The loop on lines [4]-[8] in procedure $\alpha\mathrm{Post}[\tau](l)$ is executed at most $h$ times.

(ii) $\alpha\mathrm{Post}[\tau](l) = \alpha(\mathrm{Post}[\tau](\gamma(l)))$ (i.e., $\alpha\mathrm{Post}[\tau](l)$ computes the most-precise
abstract value in $L'$ that overapproximates $\mathrm{Post}[\tau](\gamma(l))$).

The operator $\mathrm{Pre}[\tau]$ can be implemented using a procedure that is dual to Fig. 3.
Initialization: ans' := $\bot'$
                $\varphi := (z = 0) \wedge (x' = y * z) \wedge (y' = y) \wedge (z' = z)$

Iteration 1:    $\langle S, S' \rangle := \iota_c = [x \mapsto 5, y \mapsto 17, z \mapsto 0,\ x' \mapsto 0, y' \mapsto 17, z' \mapsto 0]$   // Some satisfying structure
                ans' := $[x' \mapsto 0, y' \mapsto 17, z' \mapsto 0]$
                $\hat{\gamma}'(\text{ans'}) = (x' = 0) \wedge (y' = 17) \wedge (z' = 0)$
                $\varphi := (z = 0) \wedge (x' = y * z) \wedge (y' = y) \wedge (z' = z)$
                      $\wedge\ ((x' \neq 0) \vee (y' \neq 17) \vee (z' \neq 0))$
                   $= (z = 0) \wedge (x' = y * z) \wedge (y' = y) \wedge (z' = z) \wedge (y' \neq 17)$

Iteration 2:    $\langle S, S' \rangle := \iota_c = [x \mapsto 12, y \mapsto 99, z \mapsto 0,\ x' \mapsto 0, y' \mapsto 99, z' \mapsto 0]$   // Some satisfying structure
                ans' := $[x' \mapsto 0, y' \mapsto 17, z' \mapsto 0] \sqcup [x' \mapsto 0, y' \mapsto 99, z' \mapsto 0]$
                      $= [x' \mapsto 0, y' \mapsto \top, z' \mapsto 0]$
                $\hat{\gamma}'(\text{ans'}) = (x' = 0) \wedge (z' = 0)$
                $\varphi := (z = 0) \wedge (x' = y * z) \wedge (y' = y) \wedge (z' = z) \wedge (y' \neq 17)$
                      $\wedge\ ((x' \neq 0) \vee (z' \neq 0))$
                   $= \mathit{ff}$

Iteration 3:    $\varphi$ is unsatisfiable

Return value:   $[x' \mapsto 0, y' \mapsto \top, z' \mapsto 0]$

Fig. 4. Operations performed during a call $\alpha\mathrm{Post}[\tau_{x:=y*z}]([x \mapsto \top, y \mapsto \top, z \mapsto 0])$.
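The trace of Fig. 4 can be replayed with a small Python sketch of the Fig. 3 loop. As an assumption made only for illustration, the decision procedure is replaced by a brute-force search over pairs of stores with values in a small range, and next-state stores reuse the unprimed variable names; `find_model`, `alpha_post`, and `SEARCH_RANGE` are hypothetical names.

```python
from itertools import product

TOP, BOTTOM = "TOP", None
VARS = ("x", "y", "z")
SEARCH_RANGE = range(-2, 3)  # assumption: bounded search replaces a decision procedure


def find_model(phi):
    """Return a pair (S, S') of stores satisfying phi, or None if none is found."""
    n = len(VARS)
    for values in product(SEARCH_RANGE, repeat=2 * n):
        s = dict(zip(VARS, values[:n]))
        s_next = dict(zip(VARS, values[n:]))
        if phi(s, s_next):
            return s, s_next
    return None


def join(a, b):
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return {v: a[v] if a[v] == b[v] else TOP for v in VARS}


def gamma_hat(l):
    if l is BOTTOM:
        return lambda s: False
    return lambda s: all(l[v] == TOP or s[v] == l[v] for v in VARS)


def alpha_post(tau, l):
    """Follows lines [2]-[9] of Fig. 3, with formulas as Python predicates."""
    ans = BOTTOM
    pre = gamma_hat(l)
    phi = lambda s, t: pre(s) and tau(s, t)          # line [3]
    while (model := find_model(phi)) is not None:    # lines [4]-[5]
        _, s_next = model
        ans = join(ans, s_next)                      # line [6]: only S' is abstracted
        phi = (lambda s, t, old=phi, cov=gamma_hat(ans):
               old(s, t) and not cov(t))             # line [7]
    return ans


# Transition relation for x := y * z.
tau = lambda s, t: t["x"] == s["y"] * s["z"] and t["y"] == s["y"] and t["z"] == s["z"]
result = alpha_post(tau, {"x": TOP, "y": TOP, "z": 0})
print(result)
```

The particular models chosen differ from those in Fig. 4 (which picks y = 17 and y = 99), but any two models with distinct values of y lead to the same answer after two iterations.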



5.2 Transfer Functions for Conditions 

Specification. The interpretation of a condition $\varphi$ with respect to a given abstract value $l$
must "pass through" all structures that are both represented by $l$ and satisfy $\varphi$, i.e., those
in $\gamma(l) \cap [\![\varphi]\!]$. Thus, the most-precise approximation to the interpretation of condition
$\varphi$, denoted by $\mathrm{Assume}^{\sharp}[\varphi](l)$, is defined by

$$\mathrm{Assume}^{\sharp}[\varphi](l) = \alpha(\gamma(l) \cap [\![\varphi]\!]).$$

Implementation. $\mathrm{Assume}^{\sharp}[\varphi](l)$ can be computed by the following method:

$$\mathrm{Assume}^{\sharp}[\varphi](l) = \hat{\alpha}(\hat{\gamma}(l) \wedge \varphi).$$

Example 8.

$$\mathrm{Assume}^{\sharp}[(y \leq 2)]([x \mapsto 0, y \mapsto 2, z \mapsto 7]) = \hat{\alpha}((x = 0) \wedge (y = 2) \wedge (z = 7) \wedge (y \leq 2))$$
$$= [x \mapsto 0, y \mapsto 2, z \mapsto 7]$$

$$\mathrm{Assume}^{\sharp}[(y > 2)]([x \mapsto 0, y \mapsto 2, z \mapsto 7]) = \hat{\alpha}((x = 0) \wedge (y = 2) \wedge (z = 7) \wedge (y > 2))$$
$$= \bot$$

$$\mathrm{Assume}^{\sharp}[(y < 2)]([x \mapsto 0, y \mapsto \top, z \mapsto 7]) = \hat{\alpha}((x = 0) \wedge (z = 7) \wedge (y < 2))$$
$$= [x \mapsto 0, y \mapsto \top, z \mapsto 7]$$

$$\mathrm{Assume}^{\sharp}[(y = 2)]([x \mapsto 0, y \mapsto \top, z \mapsto 7]) = \hat{\alpha}((x = 0) \wedge (z = 7) \wedge (y = 2))$$
$$= [x \mapsto 0, y \mapsto 2, z \mapsto 7]$$
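The conjunction-based implementation of Assume can be sketched in Python as follows. This is only an illustration under stated assumptions: formulas are Python predicates, the decision procedure is a brute-force search over a small range, and `find_model`, `alpha_hat`, `assume`, and `SEARCH_RANGE` are hypothetical names.

```python
from itertools import product

TOP, BOTTOM = "TOP", None
VARS = ("x", "y", "z")
SEARCH_RANGE = range(-1, 9)  # assumption: bounded search replaces a decision procedure


def find_model(phi):
    for values in product(SEARCH_RANGE, repeat=len(VARS)):
        store = dict(zip(VARS, values))
        if phi(store):
            return store
    return None


def join(a, b):
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return {v: a[v] if a[v] == b[v] else TOP for v in VARS}


def gamma_hat(l):
    if l is BOTTOM:
        return lambda s: False
    return lambda s: all(l[v] == TOP or s[v] == l[v] for v in VARS)


def alpha_hat(psi):
    ans, phi = BOTTOM, psi
    while (model := find_model(phi)) is not None:
        ans = join(ans, model)
        phi = (lambda s, old=phi, cov=gamma_hat(ans): old(s) and not cov(s))
    return ans


def assume(cond, l):
    """Assume[cond](l) computed as alpha_hat(gamma_hat(l) AND cond)."""
    in_l = gamma_hat(l)
    return alpha_hat(lambda s: in_l(s) and cond(s))


known = {"x": 0, "y": 2, "z": 7}
unknown_y = {"x": 0, "y": TOP, "z": 7}
r1 = assume(lambda s: s["y"] <= 2, known)
r2 = assume(lambda s: s["y"] > 2, known)
r3 = assume(lambda s: s["y"] < 2, unknown_y)
r4 = assume(lambda s: s["y"] == 2, unknown_y)
print(r1, r2, r3, r4)
```

The last call shows the gain in precision over a domain-agnostic filter: the condition y = 2 refines the unknown value of y to the constant 2.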
6 Discussion



This paper shows how the most-precise versions of the basic operations needed to create
an abstract interpreter are, under certain conditions, implementable. These techniques
use the idea of considering a first-order formula $\varphi$ as a device for describing (or accepting)
a set of concrete structures, namely, the set of structures that satisfy $\varphi$. Not every
subset of concrete structures can be described by a first-order formula; however, it is
straightforward to generalize the approach to other types of logics, which can be considered
as alternative structure-description formalisms (possibly more powerful, possibly
less powerful). For the basic approach to carry over, all that is required is that a decision
procedure exist for the logic.

Automatic theorem provers, such as MACE [16], SEM [20], and Finder [19],
can be used to implement the procedures presented in this paper because they return
counterexamples to validity: a counterexample to the validity of $\neg\varphi$ is a structure that
satisfies $\varphi$. Such tools also exist for logics other than first-order logic; for example,
MONA [15] can generate counterexamples for formulas in weak monadic second-order
logic.

Some tools, such as Simplify [9] and SVC [1], provide counterexamples in symbolic
form, i.e., as a formula. The formula represents a set of counterexamples; any structure
that satisfies the formula is a counterexample to the query. For example, if $\varphi$ is $x > y$ at
line [5] of Fig. 1, the value returned would be the formula $(x > y)$ itself, rather than a
particular satisfying structure, such as $[x \mapsto 7, y \mapsto 3]$. This presents an obstacle because
at line [6] $\beta$ requires an argument that is a single structure. In the case of quantifier-free
first-order logic with linear arithmetic, such a structure can be obtained by feeding the
counterexample formula to a solver for mixed-integer programming, such as CPLEX [13].



int x, y, z
Bool B1, B2
y := 3
x := 4 * y + 1
read(z)
B1 := z < 29
B2 := z < 27
if B1 then y := 5
if B2 then x := y + 8

Fig. 5. A program with correlated branches.



With the aid of Simplify, we have verified the 
constant-propagation examples in this paper, as well as 
examples that combine the constant-propagation domain 
with a predicate-abstraction domain. This is an addi- 
tional benefit of the approach: it can be used to gener- 
ate the best transformer for combined domains, such as 
reduced cardinal product and those created using other 
domain constructors [7]. For example, the best trans- 
former for the combined constant-propagation/predicate- 
abstraction domain determines that the variable x must 
be 13 at the end of the program given in Fig. 5. 
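The claim about Fig. 5 can be sanity-checked concretely. The sketch below is not the abstract analysis described above; it simply runs the program on a range of inputs, assuming that the second assignment in the figure reads x := 4 * y + 1 (the operator is damaged in the extracted text). The function name `run` is hypothetical.

```python
def run(z):
    """Execute the program of Fig. 5 for one input value z."""
    y = 3
    x = 4 * y + 1      # assumed reading of the garbled assignment
    b1 = z < 29
    b2 = z < 27
    if b1:
        y = 5
    if b2:             # b2 implies b1, so y is already 5 on this branch
        x = y + 8
    return x, y


finals = {run(z) for z in range(-100, 100)}
print(sorted(finals))
```

The value of x is the same on every path because the branches are correlated: whenever the second branch is taken, the first one was taken as well. An analysis that tracks y only as a constant loses this fact, since y is 3 or 5 after the first conditional.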



7 Related Work 



This paper is most closely related to past work on predicate abstraction, which also 
uses decision procedures to implement most-precise versions of the basic abstract- 
interpretation operations. Predicate abstraction only applies to a family of finite-height 
abstract domains that are finite Cartesian products of Boolean values; our results gen- 
eralize these ideas to a broader setting. In particular, our work shows that when a small 
number of conditions are met, most of the benefits that predicate-abstraction domains 
enjoy can also be enjoyed in arbitrary abstract domains of finite height, and possibly 
infinite cardinality. However, procedure $\hat{\alpha}$ of Fig. 1 uses an approach that is fundamentally
different from the one used in predicate abstraction. Although both approaches use
multiple calls on a decision procedure to pass from the space of formulas to the domain
of abstract values, the predicate-abstraction approach goes directly from a formula to an
abstract value, whereas $\hat{\alpha}$ of Fig. 1 makes use of the domain of concrete values in a critical
way: each time around the loop, $\hat{\alpha}$ selects a concrete value $S$ such that $S \models \varphi$; $\hat{\alpha}$ uses
$\beta$ and $\sqcup$ to generalize from concrete value $S$ to an abstract value.

Procedure $\hat{\alpha}$ is also related to an algorithm used in machine learning, called Find-S
[17, Section 2.4]. In machine-learning terminology, both algorithms search a space of
"hypotheses" to find the most-specific hypothesis that is consistent with the positive
training examples of the "concept" to be learned. Find-S receives a sequence of training
examples, and generalizes its current hypothesis each time it is presented with a positive
training example that falls outside its current hypothesis. The problem settings for the
two algorithms are slightly different: Find-S receives a sequence of positive and negative
examples of the concept, whereas $\hat{\alpha}$ already starts with a precise statement of the concept
in hand, namely, the formula $\psi$; on each iteration, the decision procedure is used to
generate the next (positive) training example.

We have sometimes been asked “How do your techniques compare with predicate 
abstraction augmented with an iterative-refinement scheme that generates new predi- 
cates, as in SLAM [3] or BLAST [12]?”. We do not have a complete answer to this 
question; however, a few observations can be made: 

- Our results extend ideas employed in the setting of predicate abstraction to a more 
general setting. 

- For the simple examples used for illustrative purposes in this paper, iterative re- 
finement would obtain suitable predicates with appropriate constant values in one 
iteration. Our techniques achieve the desired precision using roughly the same log- 
ical machinery (i.e., the availability of a decision procedure), but do not rely on 
heuristics-based machinery for changing the abstract domain in use. 

- This paper studies the problem “How can one obtain most-precise results for a given 
abstract domain?”. Iterative refinement addresses a different problem: “How can one 
go about improving an abstract domain?” These are orthogonal questions. 

The question of how to go about improving an abstract domain has not yet been 
studied for abstract domains as rich as the ones in which our techniques can be 
applied. This is the subject of future work, and thus something about which one 
can only speculate. However, we have observed that our approach does provide 
a fundamental primitive for mapping values from one abstract domain to another: 
suppose that $L_1$ and $L_2$ are two different abstract domains that meet the conditions of
the framework; given $l_1 \in L_1$, the most-precise value $l_2 \in L_2$ that overapproximates
$\gamma_1(l_1)$ is obtained by $l_2 = \hat{\alpha}_2(\hat{\gamma}_1(l_1))$.
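The domain-changing primitive can be illustrated with a Python sketch that maps a constant-propagation value into a flat sign domain. Everything here is an assumption made for illustration: the sign domain (with elements NEG, ZERO, POS, and TOP per variable) is not discussed in the paper, the decision procedure is a bounded brute-force search, and all function names are hypothetical.

```python
from itertools import product

TOP, BOTTOM = "TOP", None
VARS = ("x", "y")
SEARCH_RANGE = range(-3, 4)  # assumption: bounded search replaces a decision procedure


def find_model(phi):
    for values in product(SEARCH_RANGE, repeat=len(VARS)):
        store = dict(zip(VARS, values))
        if phi(store):
            return store
    return None


def gamma1_hat(l1):
    """Symbolic concretization for domain 1 (constant propagation)."""
    if l1 is BOTTOM:
        return lambda s: False
    return lambda s: all(l1[v] == TOP or s[v] == l1[v] for v in VARS)


def sign(n):
    return "NEG" if n < 0 else "ZERO" if n == 0 else "POS"


def beta2(store):
    """Representation function for domain 2 (flat signs)."""
    return {v: sign(store[v]) for v in VARS}


def join2(a, b):
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return {v: a[v] if a[v] == b[v] else TOP for v in VARS}


def gamma2_hat(l2):
    if l2 is BOTTOM:
        return lambda s: False
    return lambda s: all(l2[v] == TOP or sign(s[v]) == l2[v] for v in VARS)


def alpha2_hat(psi):
    ans, phi = BOTTOM, psi
    while (model := find_model(phi)) is not None:
        ans = join2(ans, beta2(model))
        phi = (lambda s, old=phi, cov=gamma2_hat(ans): old(s) and not cov(s))
    return ans


def change_domain(l1):
    """l2 = alpha2_hat(gamma1_hat(l1))."""
    return alpha2_hat(gamma1_hat(l1))


l2 = change_domain({"x": 2, "y": TOP})
print(l2)
```

Only the representation function, join, and symbolic concretization of the target domain are needed; no translation specific to the pair of domains has to be written by hand.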

The domain-changing primitive opens up several possibilities for future work. For 
example, counterexample-guided abstraction-refinement strategies [5,4] identify the 
shortest invalid prefix of a spurious counterexample trace, and then refine the abstract 
domain to eliminate invalid transitions out of the last valid abstract state of the prefix. 
The domain-changing primitive appears to provide a systematic way to salvage 
information from the counterexample trace: for instance, it can be invoked to convert 
the last valid abstract state of the prefix into an appropriate abstract state in the refined 
abstract domain. Moreover, it yields the most-precise value that any conservative 
salvaging operation is allowed to produce. 
In summary, because our results enable a better separation of concerns between the
issue of how to obtain most-precise results for a given abstract domain and that of how 
to improve an abstract domain, they contribute to a better understanding of abstraction 
and symbolic approaches to abstract interpretation. 
Abstract. Predicate abstraction provides a powerful tool for verifying properties
of infinite-state systems using a combination of a decision procedure for a subset of 
first-order logic and symbolic methods originally developed for finite-state model 
checking. We consider models where the system state contains mutable function 
and predicate state variables. Such a model can describe systems containing ar- 
bitrarily large memories, buffers, and arrays of identical processes. We describe 
a form of predicate abstraction that constructs a formula over a set of universally 
quantified variables to describe invariant properties of the function state variables. 
We provide a formal justification of the soundness of our approach and describe 
how it has been used to verify several hardware and software designs, including a 
directory-based cache coherence protocol with unbounded FIFO channels. 



1 Introduction 



Graf and Saïdi introduced predicate abstraction [10] as a means of automatically
determining invariant properties of infinite-state systems. With this approach, the user
provides a set of $k$ Boolean formulas describing possible properties of the system state.
These predicates are used to generate a finite state abstraction (containing at most $2^k$
states) of the system. By performing a reachability analysis of this finite-state model, 
a predicate abstraction tool can generate the strongest possible invariant for the system 
expressible in terms of this set of predicates. Prior implementations of predicate abstrac- 
tion [10,7,1,8] required making a large number of calls to a theorem prover or first-order 
decision procedure and hence could only be applied to cases where the number of pred- 
icates was small. More recently, we have shown that both BDD and SAT-based Boolean 
methods can be applied to perform the analysis efficiently [12]. 

In most formulations of predicate abstraction, the predicates contain no free variables;
they evaluate to true or false for each system state. The abstraction function $\alpha$ has a
simple form, mapping each concrete system state to a single abstract state based on the
effect of evaluating the $k$ predicates. The task of predicate abstraction is to construct a
formula $\psi^*$ consisting of some Boolean combination of the predicates such that $\psi^*(s)$
holds for every reachable system state $s$.

* This research was supported in part by the Semiconductor Research Corporation, Contract 
RID 1029. 
To verify systems containing unbounded resources, such as buffers and memories of
arbitrary size and systems with an arbitrary number of identical, concurrent processes, the
system model must support state variables that are functions or predicates [11,4]. For
example, a memory can be represented as a function mapping an address to the data
stored at that address, while a buffer can be represented as a function mapping an integer
index to the value stored at the specified buffer position. The state elements of a set of
identical processes can be modeled as functions mapping an integer process identifier to
the state element for the specified process. In many systems, this capability is restricted to
arrays that can be altered only by writing to a single location [7,11]. Our verifier allows
a more general form of mutable function, where the updating operation is expressed
using lambda notation.

In verifying systems with function state variables, we require quantified predicates to
describe global properties of state variables, such as "At most one process is in its critical
section," as expressed by the formula $\forall i, j : \mathrm{crit}(i) \wedge \mathrm{crit}(j) \Rightarrow i = j$. Conventional
predicate abstraction restricts the scope of a quantifier to within an individual predicate.
System invariants often involve complex formulas with widely scoped quantifiers. The 
scoping restriction implies that these invariants cannot be divided into small, simple 
predicates. This puts a heavy burden on the user to supply predicates that encode in- 
tricate sets of properties about the system. Recent work attempts to discover quantified 
predicates automatically [7], but it has not been successful for many of the systems that 
we consider. 

In this paper we present an extension of predicate abstraction in which the user supplies
predicates that include free variables from a set of index variables $\mathcal{X}$. The predicate
abstraction engine constructs a formula $\psi^*$ consisting of a Boolean combination of
these predicates, such that the formula $\forall \mathcal{X}\, \psi^*(s)$ holds for every reachable system state
$s$. With this method, the predicates can be very simple, with the predicate abstraction
tool constructing complex, quantified invariant formulas. For example, the property that
at most one process can be in its critical section could be derived by supplying predicates
crit(i), crit(j), and i = j, where i and j are the index symbols. Encoding these
predicates in the abstract system with Boolean variables $c_i$, $c_j$, and $e_{ij}$, respectively,
we can verify this property by using predicate abstraction to prove that $c_i \wedge c_j \Rightarrow e_{ij}$
holds for every reachable state of the abstract system.
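The encoding of the mutual-exclusion property can be illustrated on a single concrete state with a short Python sketch. It is only a sketch under stated assumptions: the concrete state is modeled as the set of process identifiers currently in their critical sections, the index symbols range over a finite set of identifiers, and the function names are hypothetical.

```python
from itertools import product


def abstract_states(crit, process_ids):
    """Valuations of (crit(i), crit(j), i = j) over all choices of the indices i and j."""
    return {(i in crit, j in crit, i == j)
            for i, j in product(process_ids, repeat=2)}


def invariant_holds(states):
    """Check c_i AND c_j => e_ij on every abstract state."""
    return all((not (ci and cj)) or eij for ci, cj, eij in states)


ids = range(4)
ok = invariant_holds(abstract_states({2}, ids))       # one process in its critical section
bad = invariant_holds(abstract_states({1, 2}, ids))   # two processes in their critical sections
print(ok, bad)
```

The second state violates the property because choosing i = 1 and j = 2 yields the abstract state in which both $c_i$ and $c_j$ hold while $e_{ij}$ does not.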

Flanagan and Qadeer use a method similar to ours [8] for constructing universally
quantified loop invariants for sequential software, and we briefly described our method in an
earlier paper [12]. Our contribution in this paper is to describe the method more carefully,
explore its properties, and to provide a formal argument for its soundness. The key idea
of our approach is to formulate the abstraction function $\alpha$ to map a concrete system state
$s$ to the set of all possible valuations of the predicates, considering the set of possible
values for the index variables $\mathcal{X}$. The resulting abstract system is unusual; it is not
characterized by a state transition relation and hence cannot be viewed as a state transition
system. Nonetheless, it provides an abstract interpretation of the concrete system [6]
that can be used to find invariant system properties.

Assuming a decision procedure that can determine the satisfiability of a formula with
universal quantifiers, we can prove the following completeness result for our formulation:
Predicate abstraction can prove any property that can be proved by induction on the state
sequence using an induction hypothesis expressed as a universally quantified formula
over the given set of predicates. For many modeling logics, this decision problem is
undecidable. By using quantifier instantiation, we can implement a sound, but incomplete
verifier.

As an extension, we show that it is easy to incorporate axioms into the system, prop- 
erties that must hold universally for every system state. Axioms can be viewed simply 
as quantified predicates that must evaluate to true on every step. For brevity, this paper 
only sketches the main proofs. We conclude the paper by describing our use of predicate 
abstraction to verify several hardware and software systems, including a directory-based 
cache coherence protocol devised by Steven German [9]. We believe we are the first to 
verify the protocol for a system with an unbounded number of clients, each communi- 
cating via unbounded FIFO channels. 



2 Preliminaries 



We assume the concrete system is defined in terms of some decidable subset of first-order 
logic. Our implementation is based on the CLU logic [4], supporting expressions con- 
taining uninterpreted functions and predicates, equality and ordering tests, and addition 
by integer constants, but the ideas of this paper do not depend on the specific model- 
ing formalism. For discussion, we assume that the logic supports Booleans, integers, 
functions mapping integers to integers, and predicates mapping integers to Booleans. 



2.1 Notation 

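Rather than using the common indexed vector notation to represent collections of values
(e.g., $v = \langle v_1, v_2, \ldots, v_n \rangle$), we use a named set notation. That is, for a set of symbols
$\mathcal{A}$, we let $v$ indicate a set consisting of a value $v_x$ for each $x \in \mathcal{A}$.

For a set of symbols $\mathcal{A}$, let $\sigma_{\mathcal{A}}$ denote an interpretation of these symbols, assigning to
each symbol $x \in \mathcal{A}$ a value $\sigma_{\mathcal{A}}(x)$ of the appropriate type (Boolean, integer, function,
or predicate). Let $\Sigma_{\mathcal{A}}$ denote the set of all interpretations $\sigma_{\mathcal{A}}$ over the symbol set $\mathcal{A}$.

For interpretations $\sigma_{\mathcal{A}}$ and $\sigma_{\mathcal{B}}$ over disjoint symbol sets $\mathcal{A}$ and $\mathcal{B}$, let $\sigma_{\mathcal{A}} \cdot \sigma_{\mathcal{B}}$ denote an
interpretation assigning either $\sigma_{\mathcal{A}}(x)$ or $\sigma_{\mathcal{B}}(x)$ to each symbol $x \in \mathcal{A} \cup \mathcal{B}$, according to
whether $x \in \mathcal{A}$ or $x \in \mathcal{B}$.

For symbol set $\mathcal{A}$, let $E(\mathcal{A})$ denote the set of all expressions in the logic over $\mathcal{A}$. For any
expression $e \in E(\mathcal{A})$ and interpretation $\sigma_{\mathcal{A}} \in \Sigma_{\mathcal{A}}$, let the valuation of $e$ with respect
to $\sigma_{\mathcal{A}}$, denoted $\langle e \rangle_{\sigma_{\mathcal{A}}}$, be the (Boolean, integer, function, or predicate) value obtained by
evaluating $e$ when each symbol $x \in \mathcal{A}$ is replaced by its interpretation $\sigma_{\mathcal{A}}(x)$.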

Let $v$ be a named set over symbols $\mathcal{A}$, consisting of expressions over symbol set $\mathcal{B}$.
That is, $v_x \in E(\mathcal{B})$ for each $x \in \mathcal{A}$. Given an interpretation $\sigma_{\mathcal{B}}$ of the symbols in $\mathcal{B}$,
evaluating the expressions in $v$ defines an interpretation of the symbols in $\mathcal{A}$, which we
denote $\langle v \rangle_{\sigma_{\mathcal{B}}}$. That is, $\langle v \rangle_{\sigma_{\mathcal{B}}}$ is an interpretation $\sigma_{\mathcal{A}}$ such that
$\sigma_{\mathcal{A}}(x) = \langle v_x \rangle_{\sigma_{\mathcal{B}}}$ for each $x \in \mathcal{A}$.

A substitution $\pi$ for a set of symbols $\mathcal{A}$ is a named set of expressions over some set of
symbols $\mathcal{B}$ (with no restriction on the relation between $\mathcal{A}$ and $\mathcal{B}$). That is, for each $x \in \mathcal{A}$,
there is an expression $\pi_x \in E(\mathcal{B})$. For an expression $e \in E(\mathcal{A} \cup \mathcal{C})$, applying $\pi$ to $e$
yields the expression $e' \in E(\mathcal{B} \cup \mathcal{C})$ resulting when we (simultaneously) replace each
occurrence of every symbol $x \in \mathcal{A}$ with the expression $\pi_x$.



2.2 System Model 

We model the system as having a number of state elements, where each state element
may be a Boolean or integer value, or a function or predicate. We use symbolic names to
represent the different state elements, giving the set of state symbols $\mathcal{V}$. We introduce a set
of initial state symbols $\mathcal{J}$ and a set of input symbols $\mathcal{I}$ representing, respectively, initial
values and inputs that can be set to arbitrary values on each step of operation. Among
the state variables, there can be immutable values expressing the behavior of functional
units, such as ALUs, and system parameters such as the total number of processes or
the maximum size of a buffer. Since these values are expressed symbolically, one run of
the verifier can prove the correctness of the system for arbitrary functionalities, process
counts, and buffer capacities.

The overall system operation is characterized by an initial-state expression set $q^0$ and
a next-state expression set $\delta$. The initial state consists of an expression for each state
element, with the initial value of state element $x$ given by expression $q^0_x \in E(\mathcal{J})$.
The transition behavior also consists of an expression for each state element, with the
behavior for state element $x$ given by expression $\delta_x \in E(\mathcal{V} \cup \mathcal{I})$. In this expression, the
state element symbols represent the current system state and the input symbols represent
the current values of the inputs. The expression gives the new value for that state element.

We will use a very simple system as a running example throughout this presentation.
The only state element is a function F. An input symbol i determines which element of
F is updated. Initially, F is the identity function: $q^0_F = \lambda u\,.\,u$. On each step, the value
of the function for argument i is updated to be F(i + 1). That is,
$\delta_F = \lambda u\,.\,\mathit{ITE}(u = i, F(i+1), F(u))$, where the if-then-else operation ITE selects its second
argument when the first one evaluates to true and the third otherwise.
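The running example can be mirrored directly in Python, where a function-valued state element is a closure and the lambda update becomes a new closure over the previous one. The names `initial_state` and `step` are hypothetical, introduced only for this sketch.

```python
def initial_state():
    """Initial value of F: the identity function."""
    return lambda u: u


def step(F, i):
    """Next-state expression for F: lambda u . ITE(u = i, F(i + 1), F(u))."""
    return lambda u: F(i + 1) if u == i else F(u)


F0 = initial_state()
F1 = step(F0, 3)   # input i = 3: F(3) becomes F0(4) = 4
F2 = step(F1, 2)   # input i = 2: F(2) becomes F1(3) = 4
print([F2(u) for u in range(6)])
```

Each step leaves all arguments other than i untouched, so after finitely many steps the function still agrees with the identity on all but finitely many arguments.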



2.3 Concrete System 

A concrete system state assigns an interpretation to every state symbol. The set of states
of the concrete system is given by $\Sigma_{\mathcal{V}}$, the set of interpretations of the state element
symbols. For convenience, we denote concrete states using letters $s$ and $t$ rather than the
more formal $\sigma_{\mathcal{V}}$.

From our system model, we can characterize the behavior of the concrete system in
terms of an initial state set $Q^0_C \subseteq \Sigma_{\mathcal{V}}$ and a next-state function operating on sets
$N_C : P(\Sigma_{\mathcal{V}}) \to P(\Sigma_{\mathcal{V}})$. The initial state set is defined as
$Q^0_C = \{ \langle q^0 \rangle_{\sigma_{\mathcal{J}}} \mid \sigma_{\mathcal{J}} \in \Sigma_{\mathcal{J}} \}$,
i.e., the set of all possible valuations of the initial state expressions. The next-state
function $N_C$ is defined for a single state $s$ as
$N_C(s) = \{ \langle \delta \rangle_{s \cdot \sigma_{\mathcal{I}}} \mid \sigma_{\mathcal{I}} \in \Sigma_{\mathcal{I}} \}$, i.e., the set
of all valuations of the next-state expressions for concrete state $s$ and arbitrary input.
The function is then extended to sets of states by defining $N_C(S_C) = \bigcup_{s \in S_C} N_C(s)$.
We can also characterize the next-state behavior of the concrete system by a transition
relation $T$ where $(s, t) \in T$ when $t \in N_C(s)$.

We define the set of reachable states $R_C$ as containing those states $s$ such that there is
some state sequence $s_0, s_1, \ldots, s_n$ with $s_0 \in Q^0_C$, $s_n = s$, and $s_{i+1} \in N_C(s_i)$ for all
values of $i$ such that $0 \leq i < n$. We define the depth of a reachable state $s$ to be the length
$n$ of the shortest sequence leading to $s$. Since our concrete system has an infinite number
of states, there is no finite bound on the maximum depth over all reachable states.

With our example system, the concrete state set consists of integer functions $f$ such that
$f(u+1) \geq f(u) \geq u$ for all $u$ and $f(u) = u$ for infinitely many arguments of $f$.



3 Predicate Abstraction 



We use quantified predicates to express constraints on the system state. To define the
abstract state space, we introduce a set of predicate symbols $\mathcal{P}$ and a set of index symbols
$\mathcal{X}$. The predicates consist of a named set $\phi$, where for each $p \in \mathcal{P}$, predicate $\phi_p$ is a
Boolean formula over the symbols in $\mathcal{V} \cup \mathcal{X}$.

Our predicates define an abstract state space $\Sigma_{\mathcal{P}}$, consisting of all interpretations $\sigma_{\mathcal{P}}$ of
the predicate symbols. For $k = |\mathcal{P}|$, the state space contains $2^k$ elements.

As an illustration, suppose for our example system we wish to prove that state element
F will always be a function $f$ satisfying $f(u) \geq 0$ for all $u \geq 0$. We introduce an index
variable $x$ and predicate symbols $\mathcal{P} = \{p, q\}$, with $\phi_p \doteq F(x) \geq 0$ and $\phi_q \doteq x \geq 0$.

We can denote a set of abstract states by a Boolean formula $\psi \in E(\mathcal{P})$. This expression
defines a set of states $\langle \psi \rangle \doteq \{ \sigma_{\mathcal{P}} \mid \langle \psi \rangle_{\sigma_{\mathcal{P}}} = \mathbf{true} \}$. As an example, our two predicates
$\phi_p$ and $\phi_q$ generate an abstract space consisting of four elements, which we denote FF,
FT, TF, and TT, according to the interpretations assigned to $p$ and $q$. There are then 16
possible abstract state sets, some of which are shown in Table 1. In this table, abstract
state sets are represented both by Boolean formulas over $p$ and $q$, and by enumerations
of the state elements.

We define the abstraction function $\alpha$ to map each concrete state to the set of abstract states
given by the valuations of the predicates for all possible values of the index variables:

$$\alpha(s) = \{ \langle \phi \rangle_{s \cdot \sigma_{\mathcal{X}}} \mid \sigma_{\mathcal{X}} \in \Sigma_{\mathcal{X}} \} \qquad (1)$$

Since there are multiple interpretations $\sigma_{\mathcal{X}}$, a single concrete state will generally map to
multiple abstract states. This feature is not found in most uses of predicate abstraction,
but it is the key idea for handling quantified predicates. We then extend the abstraction
function to apply to sets of concrete states in the usual way: $\alpha(S_C) = \bigcup_{s \in S_C} \alpha(s)$. We
can see that $\alpha$ is monotonic, i.e., that if $S_C \subseteq T_C$, then $\alpha(S_C) \subseteq \alpha(T_C)$.
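
Table 1. Example abstract state sets and their concretizations. Abstract state elements are
represented by their interpretations of $p$ and $q$. The terms are interpreted over $\mathbb{Z}$.

| Abstract system: formula $\psi$ | Abstract system: state set $S_A$ | Concrete system: property $\forall\mathcal{X}\,\psi^*$ | Concrete system: state set $S_C = \gamma(S_A)$ |
|---|---|---|---|
| $p \wedge q$ | {TT} | $\forall x : F(x) \ge 0 \wedge x \ge 0$ | $\emptyset$ |
| $p \wedge \neg q$ | {TF} | $\forall x : F(x) \ge 0 \wedge x < 0$ | $\emptyset$ |
| $\neg q$ | {FF, TF} | $\forall x : x < 0$ | $\emptyset$ |
| $p$ | {TF, TT} | $\forall x : F(x) \ge 0$ | $\{f \mid \forall x : f(x) \ge 0\}$ |
| $p \vee \neg q$ | {FF, TF, TT} | $\forall x : x \ge 0 \Rightarrow F(x) \ge 0$ | $\{f \mid \forall x : x \ge 0 \Rightarrow f(x) \ge 0\}$ |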


Working with our example system, consider the concrete state given by the function
$\lambda u\,.\,u$. When we abstract this function relative to predicates $\phi_p$ and $\phi_q$, we get two
abstract states: TT, when $x \ge 0$, and FF, when $x < 0$. This abstract state set is then
characterized by the formula $p \Leftrightarrow q$.
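
The abstraction of the identity function can be reproduced with a few lines of Python. This is only an illustrative sketch: the index variable ranges over a bounded window of integers (an assumption, since the paper's index domain is all of the integers), and the helper name `alpha_state` is hypothetical.

```python
def alpha_state(f, xs):
    """Abstract states (p, q) of concrete state f, with
    p := f(x) >= 0 and q := x >= 0, as x ranges over xs.
    xs is a bounded stand-in for the unbounded index domain (assumption)."""
    return {(f(x) >= 0, x >= 0) for x in xs}


def identity(u):
    return u


# Abstract the identity function over the window -5..5.
abstract_states = alpha_state(identity, range(-5, 6))
print(sorted(abstract_states))
```

Only the pairs (False, False) and (True, True) appear, i.e. the states FF and TT, which is exactly the set described by the formula p ⇔ q.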

We define the concretization function $\gamma$ to require universal quantification over the index
symbols. That is, for a set of abstract states $S_A \subseteq \Sigma_\mathcal{P}$:

$\gamma(S_A) \doteq \{s \mid \forall \sigma_\mathcal{X} \in \Sigma_\mathcal{X} : \langle\phi\rangle_{s \cdot \sigma_\mathcal{X}} \in S_A\}$    (2)

The universal quantifier in this definition has the consequence that the concretization
function does not distribute over set union. In particular, we cannot view the concretization
function as operating on individual abstract states, but rather as generating each
concrete state from multiple abstract states. Nonetheless, $\gamma$ is monotonic, i.e., if $S_A \subseteq T_A$,
then $\gamma(S_A) \subseteq \gamma(T_A)$.

Consider our example system with predicates $\phi_p$ and $\phi_q$. Table 1 shows some example
abstract state sets $S_A$ and their concretizations $\gamma(S_A)$. As the first three examples show,
some (altogether 6) nonempty abstract state sets have empty concretizations, because
they constrain $x$ to be either always negative or always non-negative. On the other hand,
there are 9 abstract state sets having nonempty concretizations. We can see by this that
the concretization function is based on the entire abstract state set and not just on the
individual values. For example, the sets {TF} and {TT} have empty concretizations, but
{TF, TT} concretizes to the set of all non-negative functions.
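
The counts above (9 sets with nonempty concretizations, 6 nonempty sets with empty ones) can be checked by brute force on a bounded model. The sketch below is an illustration under explicit assumptions: index values are restricted to the window {-1, 0} and function values to {-1, 0}, which is enough to exercise both signs; all names are hypothetical.

```python
from itertools import combinations, product

XS = (-1, 0)     # bounded window of index values (assumption)
VALS = (-1, 0)   # bounded range of function values (assumption)

# A concrete state is a function f, stored as a dict over XS.
CONCRETE = [dict(zip(XS, vals)) for vals in product(VALS, repeat=len(XS))]


def alpha_state(f):
    """Abstract states (p, q) of f: p := f(x) >= 0, q := x >= 0."""
    return frozenset((f[x] >= 0, x >= 0) for x in XS)


def gamma(abstract_set):
    """Concrete states whose abstract image lies entirely inside abstract_set."""
    return [f for f in CONCRETE if alpha_state(f) <= abstract_set]


ABSTRACT = list(product((False, True), repeat=2))   # FF, FT, TF, TT
ALL_SETS = [frozenset(c) for r in range(len(ABSTRACT) + 1)
            for c in combinations(ABSTRACT, r)]      # all 16 subsets

nonempty_gamma = [s for s in ALL_SETS if gamma(s)]
empty_gamma = [s for s in ALL_SETS if s and not gamma(s)]

TF, TT = (True, False), (True, True)
print(len(nonempty_gamma), len(empty_gamma))
print(gamma(frozenset({TF})), gamma(frozenset({TT})), gamma(frozenset({TF, TT})))
```

In this bounded model, {TF} and {TT} each concretize to nothing, while their union concretizes to the single non-negative function on the window, which illustrates why γ does not distribute over set union.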

Theorem 1. The functions $(\alpha, \gamma)$ form a Galois connection *, i.e., for any sets of concrete
states $S_C$ and abstract states $S_A$:

$\alpha(S_C) \subseteq S_A \iff S_C \subseteq \gamma(S_A)$    (3)

The proof follows by observing that both the left and the right-hand sides of (3) hold
precisely when for every $\sigma_\mathcal{X} \in \Sigma_\mathcal{X}$ and every $s \in S_C$, we have $\langle\phi\rangle_{s \cdot \sigma_\mathcal{X}} \in S_A$.

* This is one of several logically equivalent formulations of a Galois connection [6]. 

A Galois connection also satisfies the property (which follows from (3)) that for any set of
concrete states $S_C$:

$S_C \subseteq \gamma(\alpha(S_C))$    (4)

The containment relation in (4) can be proper. For example, the concrete state set
consisting of the single function $\lambda u\,.\,u$ abstracts to the state set $p \Leftrightarrow q$, which in turn
concretizes to the set of all functions $f$ such that $f(u) \ge 0 \Leftrightarrow u \ge 0$.
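
For readers who want the step from (3) to (4) spelled out, the standard argument is sketched below; it uses only the symbols already defined in this section and is offered as a reading aid rather than as part of the original proof.

```latex
\begin{align*}
  & \alpha(S_C) \subseteq \alpha(S_C)
    && \text{(reflexivity of } \subseteq \text{)} \\
  \Longrightarrow\quad & S_C \subseteq \gamma(\alpha(S_C))
    && \text{(apply (3), left to right, with } S_A = \alpha(S_C) \text{)}
\end{align*}
```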



4 Abstract System 

Predicate abstraction involves performing a reachability analysis over the abstract state
space, where on each step we concretize the abstract state set via $\gamma$, apply the concrete
next-state function, and then abstract the results via $\alpha$. We can view this process as
performing reachability analysis on an abstract system having initial state set
$Q_A^0 = \alpha(Q_C^0)$ and a next-state function operating on sets: $N_A(S_A) = \alpha(N_C(\gamma(S_A)))$. Note
that there is no transition relation associated with this next-state function, since $\gamma$ cannot
be viewed as operating on individual abstract states.

It can be seen that $N_A$ provides an abstract interpretation of the concrete system [6,5]:

1. $N_A$ is null-preserving: $N_A(\emptyset) = \emptyset$

2. $N_A$ is monotonic: $S_A \subseteq T_A \Rightarrow N_A(S_A) \subseteq N_A(T_A)$

3. $N_A$ simulates $N_C$ (a simulation relation defined by $\alpha$): $\alpha(N_C(S_C)) \subseteq N_A(\alpha(S_C))$

We perform reachability analysis on the abstract system using $N_A$ as the next-state
function:

$R_A^0 = Q_A^0$    (5)

$R_A^{i+1} = R_A^i \cup N_A(R_A^i)$    (6)

Since the abstract system is finite, there must be some $n$ such that $R_A^n = R_A^{n+1}$. The
set of all reachable abstract states $R_A$ is then $R_A^n$. By induction on $n$, it can be shown
that if $s$ is a reachable state in the concrete system with depth $\le n$, then $\alpha(s) \subseteq R_A^n$.
From this it follows that $\alpha(s) \subseteq R_A$ for any concrete reachable state $s$, and therefore
that $\alpha(R_C) \subseteq R_A$. Thus, even though determining the set of reachable concrete states
would require examining paths of unbounded length, we can compute a conservative
approximation to this set by performing a bounded reachability analysis of the abstract
system.

It is worth noting that we cannot use the standard “frontier set” optimization in our
reachability analysis. This optimization, commonly used in symbolic model checking,
considers only the newly reached states in computing the next set of reachable states. In
our context, this would mean using the computation $R_A^{i+1} = R_A^i \cup N_A(R_A^i - R_A^{i-1})$
rather than that of (6). This optimization is not valid, due to the fact that $\gamma$, and therefore
$N_A$, does not distribute over set union.

As an illustration, let us perform reachability analysis on our example system. In the
initial state, state element F is the identity function, which we have seen abstracts to the
set represented by the formula $p \Leftrightarrow q$. This abstract state set concretizes to the set of
functions $f$ satisfying $f(u) \ge 0 \Leftrightarrow u \ge 0$. Let $h$ denote the value of F in the next state.
If input $i$ is $-1$, we would get $h(-1) = f(0) \ge 0$, but we can still guarantee that $h(u) \ge 0$
for $u \ge 0$. Applying the abstraction function, we get $R_A^1$ characterized by the formula
$p \vee \neg q$ (see Table 1). For the second iteration, the abstract state set characterized by the
formula $p \vee \neg q$ concretizes to the set of functions $f$ satisfying $f(u) \ge 0$ when $u \ge 0$,
and this condition must hold in the next state as well. Applying the abstraction function
to this set, we then get $R_A^2 = R_A^1$, and hence the process has converged.
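
The two iterations above can be replayed on a bounded version of the example. The following is a minimal sketch, not the tool's implementation: arguments and values of F are confined to the window -2..1 (an assumption that makes γ enumerable), the update is the example's next-state function F' = λu.ITE(u = i, F(i+1), F(u)) restricted to inputs i whose successor i+1 lies in the window, and all function names are hypothetical.

```python
from itertools import product

XS = (-2, -1, 0, 1)     # bounded window of arguments (assumption)
VALS = (-2, -1, 0, 1)   # bounded range of values (assumption)
CONCRETE = [dict(zip(XS, v)) for v in product(VALS, repeat=len(XS))]


def alpha_state(f):
    """Abstract states (p, q) of f: p := f(x) >= 0, q := x >= 0."""
    return frozenset((f[x] >= 0, x >= 0) for x in XS)


def alpha(states):
    out = frozenset()
    for f in states:
        out |= alpha_state(f)
    return out


def gamma(abstract_set):
    return [f for f in CONCRETE if alpha_state(f) <= abstract_set]


def next_concrete(f):
    """Successors under F' = lambda u. ITE(u = i, F(i+1), F(u))."""
    succs = []
    for i in XS:
        if i + 1 in f:          # keep i+1 inside the bounded window
            h = dict(f)
            h[i] = f[i + 1]
            succs.append(h)
    return succs


def n_abs(abstract_set):
    """N_A(S_A) = alpha(N_C(gamma(S_A))) on the bounded model."""
    return alpha(h for f in gamma(abstract_set) for h in next_concrete(f))


def reach(q0):
    """Iterate R^{i+1} = R^i | N_A(R^i) until a fixed point is reached."""
    r, trace = q0, [q0]
    while True:
        nxt = r | n_abs(r)
        if nxt == r:
            return r, trace
        r = nxt
        trace.append(r)


Q0_A = alpha([{x: x for x in XS}])   # abstraction of the identity function
R_A, trace = reach(Q0_A)
print([sorted(s) for s in trace])
```

On this bounded model the iteration starts from {FF, TT}, adds TF in the first step, and then stops, matching the set characterized by p ∨ ¬q. Note that `reach` always applies `n_abs` to the whole set, never to a frontier.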

5 Verifying Safety Properties 

A Boolean formula $\psi \in E(\mathcal{P})$ defines a property of the abstract state space. The property
is said to hold for the abstract system when it holds for every reachable abstract state.
That is, $\langle\psi\rangle_{\sigma_\mathcal{P}} = \mathbf{true}$ for all $\sigma_\mathcal{P} \in R_A$.

For Boolean formula $\psi \in E(\mathcal{P})$, define the formula $\psi^* \in E(\mathcal{V} \cup \mathcal{X})$ to be the result
of substituting the predicate expression $\phi_p$ for each predicate symbol $p \in \mathcal{P}$. That is,
viewing $\phi$ as a substitution, we have $\psi^* = \psi[\phi/\mathcal{P}]$. Formula $\psi^*$ defines a property
$\forall\mathcal{X}\psi^*$ of the concrete states. The property holds for concrete state $s$, written $\forall\mathcal{X}\psi^*(s)$,
when $\langle\psi^*\rangle_{s \cdot \sigma_\mathcal{X}} = \mathbf{true}$ for every $\sigma_\mathcal{X} \in \Sigma_\mathcal{X}$. The property holds for the concrete system
when $\forall\mathcal{X}\psi^*(s)$ holds for every reachable concrete state $s \in R_C$. Table 1 shows the
concrete system properties given by different abstract state formulas $\psi$.

Theorem 2. For a formula $\psi \in E(\mathcal{P})$, if property $\psi$ holds for the abstract system, then
property $\forall\mathcal{X}\psi^*$ holds for the concrete system.

This follows by the definition of $\alpha$ and the fact that $\alpha(R_C) \subseteq R_A$.

With our example system, letting formula $\psi = p \vee \neg q$, and noting that $p \vee \neg q \equiv q \Rightarrow p$,
we get the property $\forall x : x \ge 0 \Rightarrow F(x) \ge 0$.
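
Checking a property on the abstract system amounts to evaluating the Boolean formula on every reachable abstract state. The short sketch below does this for the example, taking the reachable set {FF, TF, TT} from the text; the function names are hypothetical.

```python
# Reachable abstract states of the example, as (p, q) pairs: FF, TF, TT.
R_A = {(False, False), (True, False), (True, True)}


def holds(psi, reachable):
    """A formula holds for the abstract system when it is true
    in every reachable abstract state."""
    return all(psi(p, q) for p, q in reachable)


def q_implies_p(p, q):
    return p or not q      # p OR NOT q, i.e. q => p


def p_only(p, q):
    return p               # the stronger candidate "F(x) >= 0 for all x"


print(holds(q_implies_p, R_A), holds(p_only, R_A))
```

The first check succeeds, so Theorem 2 yields the concrete property; the second fails because FF is reachable, so nothing can be concluded about the stronger property from this abstraction.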

Using predicate abstraction, we can possibly get a false negative result, where we fail
to verify a property $\forall\mathcal{X}\psi^*$, even though it holds for the concrete system, because the
given set of predicates does not adequately capture the characteristics of the system
that ensure the desired property. Thus, this method of verifying properties is sound, but
possibly incomplete.

We can precisely characterize the class of properties for which the predicate abstraction
is both sound and complete, assuming we have a decision procedure that can determine
whether a universally quantified formula in the underlying logic is satisfiable. A property
$\forall\mathcal{X}\psi^*$ is said to be inductive for the concrete system when it satisfies the following two
properties:

1. Every initial state $s \in Q_C^0$ satisfies $\forall\mathcal{X}\psi^*(s)$.

2. For all concrete states $s$ and $t \in N_C(s)$, if $\forall\mathcal{X}\psi^*(s)$ holds, then so does $\forall\mathcal{X}\psi^*(t)$.

Clearly an inductive property must hold for every reachable concrete state and therefore
for the concrete system. It can also be shown that if $\forall\mathcal{X}\psi^*$ is inductive, then $\psi$ holds for
the abstract system. That is, if we present the predicate abstraction engine with a fully
formed induction hypothesis, it can prove that it holds.

For formula $\psi \in E(\mathcal{P})$ and predicate set $\phi$, the property $\forall\mathcal{X}\psi^*$ is said to have an
induction proof over $\phi$ when there is some formula $\chi \in E(\mathcal{P})$, such that $\chi \Rightarrow \psi$ and
$\forall\mathcal{X}\chi^*$ is inductive. That is, there is some way to strengthen $\psi$ into a formula $\chi$ that can
be used to prove the property by induction.

Theorem 3. A formula $\psi \in E(\mathcal{P})$ is a property of the abstract system if and only if the
concrete property $\forall\mathcal{X}\psi^*$ has an induction proof over the predicate set $\phi$.

This theorem precisely characterizes the capability of our formulation of predicate
abstraction: it can prove any property that can be proved by induction using an
induction hypothesis expressed in terms of the predicates. Thus, if we fail to verify a
system using this form of predicate abstraction, we can conclude that either 1) the
system does not satisfy the property, or 2) we did not provide an adequate set of predicates
to construct a universally quantified induction hypothesis, provided one exists.



6 Quantifier Instantiation 



For many subsets of first-order logic, there is no complete method for handling the
universal quantifier introduced in function $\gamma$ (Equation 2). For example, in a logic with
uninterpreted functions and equality, determining whether a universally quantified
formula is satisfiable is undecidable [3]. Instead, we concretize abstract states by considering
some limited subset of the interpretations of the index symbols, each of which is
defined by a substitution for the symbols in $\mathcal{X}$. Our tool automatically generates candidate
substitutions based on the subexpressions that appear in the predicate and next-state
expressions [13]. These subexpressions can contain symbols in $\mathcal{V}$, $\mathcal{X}$, and $\mathcal{I}$. These
instantiated versions of the formulas enable the verifier to detect specific cases where the
predicates can be applied. Flanagan and Qadeer use a similar technique [8].

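More precisely, let $\pi$ be a substitution assigning an expression $\pi_x \in E(\mathcal{V} \cup \mathcal{X} \cup \mathcal{I})$
for each $x \in \mathcal{X}$. Then $\phi_p[\pi/\mathcal{X}]$ will be a Boolean expression over symbols $\mathcal{V}$, $\mathcal{X}$, and
$\mathcal{I}$ that represents some instantiation of predicate $\phi_p$. For a set of substitutions $\Pi$ and
interpretations $\sigma_\mathcal{X} \in \Sigma_\mathcal{X}$ and $\sigma_\mathcal{I} \in \Sigma_\mathcal{I}$, we define the concretization function $\gamma_\Pi$ as:

$\gamma_\Pi(S_A, \sigma_\mathcal{X}, \sigma_\mathcal{I}) \doteq \{s \mid \forall \pi \in \Pi : \langle\phi[\pi/\mathcal{X}]\rangle_{s \cdot \sigma_\mathcal{X} \cdot \sigma_\mathcal{I}} \in S_A\}$    (7)

It can be seen that $\gamma_\Pi$ is an overapproximation of $\gamma$, i.e., that $\gamma(S_A) \subseteq \gamma_\Pi(S_A, \sigma_\mathcal{X}, \sigma_\mathcal{I})$
for any abstract state set $S_A$, set of substitutions $\Pi$, and interpretations $\sigma_\mathcal{X}$ and $\sigma_\mathcal{I}$. From
(4), it then follows that

$S_C \subseteq \gamma_\Pi(\alpha(S_C), \sigma_\mathcal{X}, \sigma_\mathcal{I})$    (8)

and hence the functions $(\alpha, \gamma_\Pi)$ satisfy property (4) of a Galois connection, even though
they are not a true Galois connection.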

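We can use $\gamma_\Pi$ as an approximation to $\gamma$ in defining the behavior of the abstract system.
That is, define $N_\Pi$ over sets of abstract states as:

$N_\Pi(S_A) = \{\langle\phi[\delta/\mathcal{V}]\rangle_{s \cdot \sigma_\mathcal{X} \cdot \sigma_\mathcal{I}} \mid \sigma_\mathcal{X} \in \Sigma_\mathcal{X},\ \sigma_\mathcal{I} \in \Sigma_\mathcal{I},\ s \in \gamma_\Pi(S_A, \sigma_\mathcal{X}, \sigma_\mathcal{I})\}$    (9)

Observe in this equation that $\phi_p[\delta/\mathcal{V}]$ is an expression describing the evaluation of
predicate $\phi_p$ in the next state. It can be seen that $N_\Pi(S_A) \supseteq N_A(S_A)$ for any set of
abstract states $S_A$. As long as $\Pi$ is nonempty (required to guarantee that $N_\Pi$ is
null-preserving), it can be shown that the system defined by $N_\Pi$ is an abstract interpretation
of the concrete system. We can therefore perform reachability analysis:

$R_\Pi^0 = Q_A^0$    (10)

$R_\Pi^{i+1} = R_\Pi^i \cup N_\Pi(R_\Pi^i)$    (11)

These iterations will converge to a set $R_\Pi$. For every step $i$, we can see that $R_\Pi^i \supseteq R_A^i$,
and therefore we must have $R_\Pi \supseteq R_A$.

Theorem 4. For a formula $\psi \in E(\mathcal{P})$, if $\langle\psi\rangle_{\sigma_\mathcal{P}} = \mathbf{true}$ for every $\sigma_\mathcal{P} \in R_\Pi$, then
property $\forall\mathcal{X}\psi^*$ holds for the concrete system.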

This demonstrates that using quantifier instantiation during reachability analysis yields 
a sound verification technique. However, when the tool fails to verify a property, it could 
mean, in addition to the two possibilities listed earlier, that 3) it used an inadequate set of 
instantiations, or 4) that the property cannot be proved by any bounded set of quantifier 
instantiations. 



7 Symbolic Formulation of Reachability Analysis 



We are now ready to express the reachability computation symbolically, where each step
involves finding the set of satisfying solutions to an existentially quantified formula. On
each step, we generate a Boolean formula $\rho_\Pi^i$ that characterizes $R_\Pi^i$. That is, $\langle\rho_\Pi^i\rangle = R_\Pi^i$.
The formulas directly encode the approximate reachability computations of (10)
and (11).

Observe that by composing the predicate expressions with the initial state expressions,
$\phi[q^0/\mathcal{V}]$, we get a set of predicates over the initial state symbols $\mathcal{J}$ indicating the
conditions under which the predicates hold in the initial state. We can therefore start the
reachability analysis by finding solutions to the formula

$\rho_\Pi^0(\mathcal{P}) = \exists\mathcal{X}\,\exists\mathcal{J} \bigwedge_{p \in \mathcal{P}} p \Leftrightarrow \phi_p[q^0/\mathcal{V}]$    (12)
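
The formula for the next-state computation combines the definitions of $N_\Pi$ (9) and $\gamma_\Pi$ (7):

$\rho_\Pi^{i+1}(\mathcal{P}) = \rho_\Pi^i(\mathcal{P}) \vee \exists\mathcal{V}\,\exists\mathcal{X}\,\exists\mathcal{I} \left[ \left( \bigwedge_{\pi \in \Pi} \left(\rho_\Pi^i[\phi/\mathcal{P}]\right)[\pi/\mathcal{X}] \right) \wedge \bigwedge_{p \in \mathcal{P}} p \Leftrightarrow \phi_p[\delta/\mathcal{V}] \right]$    (13)

To understand the quantified term in this equation, note that the left-hand term is the
formula for $\gamma_\Pi(R_\Pi^i, \sigma_\mathcal{X}, \sigma_\mathcal{I})$, while the right-hand term expresses the conditions under
which each abstract state variable $p$ will match the value of the corresponding predicate
in the next state.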

Let us see how this symbolic formulation would perform reachability analysis for our
example system. Recall that our system has two predicates $\phi_p \doteq F(x) \ge 0$ and
$\phi_q \doteq x \ge 0$. In the initial state, F is the function $\lambda u\,.\,u$, and therefore $\phi_p[q^0/\mathcal{V}]$ simply
becomes $x \ge 0$. Equation (12) then becomes $\exists x\,[(p \Leftrightarrow x \ge 0) \wedge (q \Leftrightarrow x \ge 0)]$, which
reduces to $p \Leftrightarrow q$.

Now let us perform the first iteration. For our instantiations we require two substitutions
$\pi$ and $\pi'$ with $\pi_x = x$ and $\pi'_x = i + 1$. For $\rho_\Pi^0(p, q) = p \Leftrightarrow q$, the left-hand term of
(13) instantiates to $(F(x) \ge 0 \Leftrightarrow x \ge 0) \wedge (F(i+1) \ge 0 \Leftrightarrow i + 1 \ge 0)$. Substituting
$\lambda u\,.\,\mathrm{ITE}(u = i, F(i+1), F(u))$ for F in $\phi_p$ gives
$(x = i \wedge F(i+1) \ge 0) \vee (x \ne i \wedge F(x) \ge 0)$.

The quantified portion of (13) for $\rho_\Pi^1(p, q)$ then becomes

$\exists F, x, i : \left[ \begin{array}{l} (F(x) \ge 0 \Leftrightarrow x \ge 0) \wedge (F(i+1) \ge 0 \Leftrightarrow i + 1 \ge 0) \\ \wedge\; p \Leftrightarrow [(x = i \wedge F(i+1) \ge 0) \vee (x \ne i \wedge F(x) \ge 0)] \\ \wedge\; q \Leftrightarrow x \ge 0 \end{array} \right]$

The only values of $p$ and $q$ for which this formula cannot be satisfied are $p$ false and
$q$ true.
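
The claim about which (p, q) pairs are satisfiable can be sanity-checked by a bounded search. This sketch is not the Boolean quantifier elimination used by the tool; it simply enumerates small integer values for x and i and sign-representative values for F(x) and F(i+1) (the bounds and all names are assumptions made for illustration). A bounded search can only find witnesses, so the absence of a pair here is evidence rather than proof.

```python
from itertools import product


def satisfiable_pq(bound=3, values=(-1, 0)):
    """Collect the (p, q) pairs for which the quantified part of (13)
    has a witness with x, i in [-bound, bound] and with F(x), F(i+1)
    drawn from `values` (one negative and one non-negative value)."""
    found = set()
    rng = range(-bound, bound + 1)
    for x, i, fx, fi1 in product(rng, rng, values, values):
        if x == i + 1 and fx != fi1:
            continue  # F(x) and F(i+1) denote the same term here
        # Left-hand term: p <=> q instantiated with x and with i+1.
        if (fx >= 0) != (x >= 0):
            continue
        if (fi1 >= 0) != (i + 1 >= 0):
            continue
        # Right-hand term: predicate values in the next state.
        p = (fi1 >= 0) if x == i else (fx >= 0)
        q = x >= 0
        found.add((p, q))
    return found


pairs = satisfiable_pq()
print(sorted(pairs))
```

The search finds witnesses for FF, TF, and TT but none for FT, in agreement with the text.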

As shown in [12], we can generate the set of solutions to (12) and (13) by first
transforming the formulas into equivalent Boolean formulas and then performing Boolean
quantifier elimination to remove all Boolean variables other than those in $\mathcal{P}$. This
quantifier elimination is similar to the relational product operation used in symbolic model
checking and can be solved using either BDD or SAT-based methods.



8 Axioms 



As a special class of predicates, we may have some that are to hold at all times. For
example, we could have an axiom $f(w) > 0$ to indicate that function f is always positive,
or $f(y, z) = f(z, y)$ to indicate that f is commutative. Typically, we want these predicates
to be individually quantified, and we can ensure this by defining each of them over a
unique set of index symbols, as we have done in the above examples.

We can add this feature to our analysis by identifying a subset $\mathcal{Q}$ of the predicate symbols
$\mathcal{P}$ to be axioms. We then want to restrict the analysis to states where the axiomatic
predicates hold. Let $\Sigma_\mathcal{P}^\mathcal{Q}$ denote the set of abstract states $\sigma_\mathcal{P}$ where $\sigma_\mathcal{P}(p) = \mathbf{true}$ for
every $p \in \mathcal{Q}$. Then we can apply this restriction by redefining $\alpha(s)$ (Equation 1) for
concrete state $s$ to be:

$\alpha(s) \doteq \{\langle\phi\rangle_{s \cdot \sigma_\mathcal{X}} \mid \sigma_\mathcal{X} \in \Sigma_\mathcal{X}\} \cap \Sigma_\mathcal{P}^\mathcal{Q}$    (14)

and then using this definition in the extension of $\alpha$ to sets, the formulation of the
reachability analysis (Equations 5 and 6), and the approximate reachability analysis (Equations
10 and 11).



9 Applications 



We have used our predicate abstraction tool to verify safety properties of a variety of
models and protocols. Some of the more interesting ones include:

- A microprocessor out-of-order execution unit with an unbounded retirement buffer.
Prior verification of this unit required manually generating 13 invariants [13].

- A directory-based cache protocol with unbounded channels, devised by Steven German
of IBM [9], as discussed below.

- A version of Lamport’s bakery algorithm [14] that allows an arbitrary number of
processes and nonatomic reads and writes.

- A selection sort algorithm for sorting an arbitrarily large array. We prove the property
that upon termination, the algorithm produces a sorted array.

In German’s directory-based cache-coherence protocol, an unbounded number of
clients (caches) communicate with a central home process to gain exclusive or shared
access to a memory line. The state of each cache can be one of {invalid, shared, exclusive}.
The home maintains explicit representations of two lists of clients: those sharing the
cache line (home_sharer_list) and those for which the home has sent an invalidation
request but has not received an acknowledgement (home_invalidate_list).

The client places requests {req_shared, req_exclusive} on a channel ch_1 and the
home grants {grant_shared, grant_exclusive} on channel ch_2. The home also
sends invalidation messages invalidate along ch_2. The home grants exclusive access to
a client only when there are no clients sharing a line, i.e. $\forall i$ : home_sharer_list(i) =
false. The home maintains variables for the current client (home_current_client) and
the request it is currently processing (home_current_command). It also maintains a bit
home_exclusive_granted to indicate that some client has exclusive access. The cache
lines acknowledge invalidation requests with an invalidate_ack along another channel
ch_3. Details of the protocol operation with single-entry channels can be found in many
previous works including [15].

In our version of the protocol, each cache communicates with the home process through
three directed unbounded FIFO channels, namely the channels ch_1, ch_2, ch_3. Thus,
there are an unbounded number of unbounded channels, three for each client.² It can
be shown that a client can generate an unbounded number of requests before getting a
response from the home.

To model the protocol in CLU, we need to change the predicate state variable
representation of home_sharer_list. Since the transition functions are expressed over
quantifier-free logic, we cannot support a universal quantifier in the model. Instead, we
model home_sharer_list as a set, using (1) a queue hsl_q = (q, hd, tl) to store all
cache indices i for which home_sharer_list(i) = true and (2) an array hsl_pos
to map a cache index i to the position in the queue, if i $\in$ hsl_q. This representation
can support addition, deletion, membership-check and emptiness-check, which are the
operations required for this protocol. In addition, this representation also allows us to
enumerate the cache indices for which home_sharer_list(i) = true.
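
One way to picture this representation is the Python sketch below. It is an assumption-laden illustration, not the CLU model: the text does not say how deletion is performed, so the sketch fills the vacated slot with the last queue entry (a common technique chosen here for simplicity), and the class and method names are hypothetical. Only the fields q, hd, tl, and the position map mirror names from the text.

```python
class SharerSet:
    """Set of cache indices kept as a queue (q, hd, tl) plus a position map."""

    def __init__(self):
        self.q = {}     # position -> cache index (stands in for an unbounded array)
        self.hd = 0     # live entries occupy positions hd .. tl-1
        self.tl = 0
        self.pos = {}   # cache index -> position in q (plays the role of hsl_pos)

    def contains(self, i):
        j = self.pos.get(i)
        return j is not None and self.hd <= j < self.tl and self.q.get(j) == i

    def is_empty(self):
        return self.hd == self.tl

    def add(self, i):
        if not self.contains(i):
            self.q[self.tl] = i
            self.pos[i] = self.tl
            self.tl += 1

    def remove(self, i):
        # Assumption: delete by moving the last entry into the vacated slot.
        if not self.contains(i):
            return
        j = self.pos.pop(i)
        last = self.q[self.tl - 1]
        if last != i:
            self.q[j] = last
            self.pos[last] = j
        del self.q[self.tl - 1]
        self.tl -= 1

    def members(self):
        return [self.q[j] for j in range(self.hd, self.tl)]


sharers = SharerSet()
for cache in (3, 7, 5):
    sharers.add(cache)
sharers.remove(7)
print(sorted(sharers.members()), sharers.contains(7), sharers.is_empty())
```

The membership test relates q and the position map in the same way as the predicates i = q(j) and j = hsl_pos(i) listed later in this section, and the emptiness test corresponds to hd = tl.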

We had previously verified the cache-coherence property of the protocol with 31
nontrivial, manually constructed invariants. In contrast, the predicate abstraction constructs
the strongest inductive invariant automatically with 29 predicates, all of which are simple
and do not involve any Boolean connectives. There are 2 index variables in $\mathcal{X}$ to specify
the predicates. The abstract reachability took 19 iterations and 263 minutes to produce
the inductive invariant.³ For the simpler version, which has single-entry channels for
communication, our method finds the inductive invariant in 85s using 17 predicates in
9 iterations. All experiments were performed on a 2.1 GHz Linux machine with 1GB
of RAM. The main difficulty of making the channels unbounded is the presence of
two-dimensional arrays in the model, and additional state variables for the head and tail
pointers for each of the unbounded queues.

For space considerations, we will only describe the nature of predicates used for the
model with single-entry channels. A few predicates did not require any index symbol.
These include: home_exclusive_granted, home_current_command = req_shared,
home_current_command = req_exclusive, hd < tl and hd = tl. For most
predicates, we required a single index variable i $\in \mathcal{X}$, to denote an arbitrary cache
index. They include: home_invalidate_list(i), cache(i) = exclusive, cache(i) =
shared, ch_2(i) = grant_exclusive, ch_2(i) = grant_shared, ch_2(i) = invalidate,
ch_3(i) = invalidate_ack and i = q(hd). We also required another index
variable j $\in \mathcal{X}$ to range over the entries of the queue hsl_q. The predicates over j are hd
< j and j < tl. Finally, to relate the entries in hsl_q and hsl_pos, we needed the
predicates i = q(j) and j = hsl_pos(i).

Most of the predicates are fairly easy to find from the model and from counterexamples.
Predicate abstraction constructs an inductive invariant of the form $\forall i, j : \psi^*(i, j)$,
which implies the cache-coherence property. This implication is checked automatically
with a sound decision procedure in UCLID [4], using quantifier instantiation.

Previous attempts at using predicate abstraction (with locally quantified predicates)
for a version of the protocol with single-entry channels required complex quantified
predicates [7,2], sometimes as complex as an invariant. However, Baukus et al. [2]
proved the liveness of the protocol in addition to the cache-coherence property. Pnueli
et al. [15] have used the method of invisible invariants to derive the inductive invariant
for the model with single-entry channels, but it is not clear if their formalism can model
the version with unbounded channels per client.

² The extension was suggested by Steven German himself.
³ There is considerable scope for optimizing the performance of our procedure.



10 Conclusions 

We have found quantified invariants to be essential in expressing the properties of systems 
with function state variables. The ability of our tool to automatically generate quantified 
invariants based on small and simple predicates allows us to deal with much more 
complex systems in a more automated fashion than previous work. A next step would 
be to automatically generate the set of predicates used by the predicate abstraction 
tool. Other tools generate new predicates based on the counterexample traces from the 
abstract model [1,7]. This approach cannot be used directly in our context, since our 
abstract system cannot be viewed as a state transition system, and so there is no way to 
characterize a counterexample by a single state sequence. We are currently looking at 
techniques to extract relevant predicates from the proof of unsatisfiable formulas which 
represent that an error state can’t be reached after any finite number of steps. 

Abstract. Given a finite-state abstraction of a sequential program with
potentially recursive procedures and input from the environment, we
wish to check statically whether there are input sequences that can drive
the system into “bad/good” executions. Pushdown games have been used
in recent years for such analyses and there is by now a very rich literature
on the subject. (See, e.g., [BS92,Tho95,Wal96,BEM97,Cac02a,CDT02].)
In this paper we use recursive game graphs to model such interprocedural
control flow in an open system. These models are intimately related to
pushdown systems and pushdown games, but more directly capture the
control flow graphs of recursive programs ([AEY01,BGR01,ATM03b]).
We describe alternative algorithms for the well-studied problems of
determining both reachability and Büchi winning strategies in such games.
Our algorithms are based on solutions to second-order data flow
equations, generalizing the Datalog rules used in [AEY01] for analysis of
recursive state machines. This offers what we feel is a conceptually simpler
view of these well-studied problems and provides another example of the
close links between the techniques used in program analysis and those of
model checking.

There are also some technical advantages to the equational approach.
Like the approach of Cachat [Cac02a], our solution avoids the necessarily
exponential-space blow-up incurred by Walukiewicz’s algorithms for
pushdown games. However, unlike [Cac02a], our approach does not rely
on a representation of the space of winning configurations of a pushdown
graph by (alternating) automata. Only “minimal” sets of exits that can
be “forced” need to be maintained, and this provides the potential for
greater space efficiency. In a sense, our algorithms can be viewed as an
“automaton-free” version of the algorithms of [Cac02a].



1 Introduction 

There has been intense activity in recent years aimed at extending the scope of
model checking to finite-state abstractions of sequential programs described by
modular and potentially recursive procedures. A partial list of references includes
[BS92,Wal96,BEM97,Rep98,EHRS00,BR00,AEY01,BGR01,Cac02a,CDT02].
Pushdown systems are one of the primary vehicles for such analyses. When such
models are analyzed in the setting of an open system, where the environment
is viewed as an adversary, a natural model to study becomes pushdown games.
There is by now a very rich literature on the analysis of pushdown games (see,
e.g., [Cau90,BS92,Tho95,Wal96,BEM97,Cac02a,CDT02]).

In this paper we use recursive game graphs to model such interprocedural
control flow in an open system. These models are intimately related to
pushdown systems and pushdown games, but more directly capture the control flow
graphs of recursive programs. Recursive state machines (RSMs) were introduced
and studied in [AEY01] and independently in [BGR01], and are related to
similar models studied in program analysis (see, e.g., [Rep98]). Besides giving
algorithms for their analysis, they showed that RSMs are expressively equivalent
to pushdown systems, with efficient translations in both directions. More
recently [ATM03b,ATM03a] have studied “modular strategies” on Recursive Game
Graphs (RGGs), a natural adaptation of RSMs to a game setting. The
translations of [AEY01,BGR01] can easily be adapted to show that pushdown games
and (labelled) RGGs are expressively equivalent.

The results of Walukiewicz [Wal96] were a key watershed in the analysis of
pushdown games. Besides much else, he showed that determining the existence
of winning strategies under any parity encodable winning condition is in
EXPTIME, and that existence of winning strategies under simple reachability
winning conditions is already EXPTIME-hard. In Walukiewicz’s algorithm one first
constructs from a pushdown game P an exponentially larger (flat) game graph
$G_P$. A winning strategy in $G_P$ corresponds to a winning strategy in P, and one
then solves $G_P$ via efficient algorithms for flat games. The disadvantage of such
an algorithm from a practical point of view is that it cannot get “lucky”:
exponential space is consumed in the first phase on any input, even if the game P may
be very “simple” to solve. Subsequently, many others have studied algorithms for
analysis of pushdown systems and pushdown games. One effective approach has
been based on the observation that the reachable configurations of a pushdown
system form a regular set ([FWW97,BEM97,EHRS00]). This approach has been
used more recently by Cachat and others [Cac02a,CDT02,Cac02b,Ser03] to give
alternative algorithms for analysis of pushdown games. These algorithms do not
necessarily incur the exponential space blow-up incurred by Walukiewicz’s
algorithm, but they do require construction of an alternating automaton accepting
the winning configurations of a pushdown game.

We describe alternative algorithms for determining both reachability and
Büchi strategies on RGGs. Our algorithms do not make use of any automata to
represent global configurations of the underlying game graph, but are instead
based on solutions to second-order data flow equations over sets of exit nodes
in the RGGs. This generalizes the Datalog rules used in [AEY01] for analysis of
recursive state machines. It is also closely related to the algorithm in [ATM03b]
for computing “modular strategies” for reachability on recursive game graphs.
Computing modular strategies is NP-complete, whereas computing arbitrary
strategies is EXPTIME-complete, so our algorithms necessarily differ. But our
underlying ideas for the reachability algorithm are closely related to theirs, and
both can be viewed as generalizations of the approach of [AEY01].

The dataflow equation approach offers what we feel to be a conceptually
simpler solution to these well-studied problems. It also provides another example
of the close links between the techniques used in program analysis and those of
model checking. In a sense, our algorithms can be viewed as an “automaton-free”
version of the algorithms of [Cac02a], although they arose in an attempt
to generalize the approach of [AEY01].

There are some technical advantages to be gained from using the equational
approach. Unlike the automaton approach, in which global winning
configurations need to be recorded as accepted strings of an alternating automaton, in
our approach only “minimal” sets of exits that can be “forced” need to be
maintained, and this provides the potential for greater space efficiency when one is
interested in, e.g., whether particular vertices can be reached under any
“context”. Also, a formulation based on solutions of data flow equations allows for
the application of well established techniques in program analysis for efficient
evaluation, such as worklist data structures to manage sets that require updates
(see, e.g., [NNH99,App98]).

The sections of the paper are as follows: in section 2 we provide background
definitions on recursive game graphs, in section 3 we provide our algorithm for
RGGs under a reachability winning condition, in section 4 we extend these to
Büchi conditions, and we conclude in section 5.

2 Definitions 

Syntax. A recursive game graph (RGG) $A$ is given by a tuple $(A_1, \ldots, A_k)$,
where each component game graph $A_i = (N_i \cup B_i, Y_i, En_i, Ex_i, \delta_i)$ consists of
the following pieces:

— A set $N_i$ of nodes and a (disjoint) set $B_i$ of boxes.

— A labelling $Y_i : B_i \to \{1, \ldots, k\}$ that assigns to every box an index of one of
the component machines, $A_1, \ldots, A_k$.

— A set of entry nodes $En_i \subseteq N_i$, and a set of exit nodes $Ex_i \subseteq N_i$.

— A transition relation $\delta_i$, where transitions are of the form $(u, v)$ where:

1. the source $u$ is either a non-exit node in $N_i \setminus Ex_i$, or a pair $(b, x)$, where
$b$ is a box in $B_i$ and $x$ is an exit node in $Ex_j$, where $j = Y_i(b)$;

2. the destination $v$ is either a non-entry node in $N_i \setminus En_i$, or a pair $(b, e)$,
where $b$ is a box in $B_i$ and $e$ is an entry node in $En_j$, where $j = Y_i(b)$.

Let $N = \bigcup_{i=1}^{k} N_i$ denote the set of all nodes. For a box $b$ in $A_i$ and an entry
$e$ of $Y_i(b)$, we define the pair $(b, e)$ to be a call. Likewise, for an exit $x$ of $Y_i(b)$, we
say $(b, x)$ is a return. We will use the term vertex to refer collectively to nodes,
calls, and returns that participate in some transition, and we denote this set
by $Q = \bigcup_{i=1}^{k} Q_i$. That is, the transition relation $\delta_i$ is a set of labelled directed
edges on the set $Q_i$ of vertices of $A_i$. Let $\delta = \bigcup_i \delta_i$ be the set of all edges in $A$. In
addition to the tuple $(A_1, \ldots, A_k)$, we are also given a partition of the vertices
$Q$ into two disjoint sets $Q^0$ and $Q^1$, corresponding to where it is player 0's and
player 1's turn to play, respectively.
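The syntactic constraints above can be checked mechanically. The following is only a sketch under assumptions that are not part of the paper: nodes and boxes are encoded as strings, calls and returns as (box, node) pairs, and the names `Component` and `check_component` are hypothetical.

```python
from dataclasses import dataclass, field


# Hypothetical encoding: nodes and boxes are strings, a call or return is a
# (box, node) tuple, and an RGG is a dict from component index to Component.
@dataclass
class Component:
    nodes: set        # N_i
    boxes: dict       # Y_i, mapping each box to the index of the component it calls
    entries: set      # En_i, a subset of N_i
    exits: set        # Ex_i, a subset of N_i
    delta: set = field(default_factory=set)   # transitions (u, v)


def check_component(i, rgg):
    """Return True if component i satisfies the constraints on sources and destinations."""
    c = rgg[i]
    if not (c.entries <= c.nodes and c.exits <= c.nodes):
        return False
    if c.nodes & set(c.boxes):              # N_i and B_i must be disjoint
        return False
    for (u, v) in c.delta:
        if isinstance(u, tuple):            # source is a return (b, x)
            b, x = u
            if b not in c.boxes or x not in rgg[c.boxes[b]].exits:
                return False
        elif u not in c.nodes - c.exits:    # otherwise a non-exit node
            return False
        if isinstance(v, tuple):            # destination is a call (b, e)
            b, e = v
            if b not in c.boxes or e not in rgg[c.boxes[b]].entries:
                return False
        elif v not in c.nodes - c.entries:  # otherwise a non-entry node
            return False
    return True
```

The tuple test distinguishes calls and returns from plain nodes, so this encoding would not work if node names were themselves tuples.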
Semantics. An RGG defines a global game graph $T_A = (V, V_0, V_1, \Delta)$. The
global states $V$ of $T_A$, for RGG $A$, are tuples $\langle b_1, \ldots, b_r, u \rangle$, where
$b_1, \ldots, b_r$ are boxes and $u$ is a vertex (not just a node). The global transition
relation $\Delta$ is given as follows: let $s = \langle b_1, \ldots, b_r, u \rangle$ be a state with $u$ a
vertex in $Q_j$, and $b_r \in B_m$. Then, $(s, s') \in \Delta$ iff one of the following holds:

1. $(u, u') \in \delta_j$ for a vertex $u'$ of $A_j$, and $s' = \langle b_1, \ldots, b_r, u' \rangle$.

2. $u$ is a call vertex $u = (b', e)$ of $A_j$, and $s' = \langle b_1, \ldots, b_r, b', e \rangle$.

3. $u$ is an exit node of $A_j$, and $s' = \langle b_1, \ldots, b_{r-1}, (b_r, u) \rangle$.

Case 1 corresponds to when control stays within the component $A_j$, case 2 is
when a new component is entered via a box of $A_j$, and case 3 is when control
exits $A_j$ and returns to $A_m$.
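The three cases can be read as a successor function on global states. The sketch below is illustrative only and rests on assumptions not made in the paper: a global state is a Python tuple whose last element is the current vertex, vertex and box names are unique across components, `delta` is the union of all component transition relations, and the function name `successors` is hypothetical.

```python
def successors(state, delta, calls, exits):
    """Successors of a global state (b_1, ..., b_r, u) under cases 1 to 3.

    delta: set of transitions (u, v), the union over all components
    calls: set of call vertices (b, e)
    exits: set of all exit nodes
    """
    *stack, u = state
    result = set()
    # Case 1: a transition (u, u') inside the component leaves the stack unchanged.
    for (src, dst) in delta:
        if src == u:
            result.add((*stack, dst))
    # Case 2: a call (b', e) pushes the box b' and moves to the entry e.
    if u in calls:
        box, entry = u
        result.add((*stack, box, entry))
    # Case 3: an exit pops the top box b_r and continues at the return (b_r, u).
    if u in exits and stack:
        result.add((*stack[:-1], (stack[-1], u)))
    return result
```

Note that the one-element state whose vertex is the call `(b, e)` and the two-element state with box `b` on the stack and vertex `e` are distinct tuples in this encoding.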

The global states are partitioned into sets $V_0$ and $V_1$ as follows: $s = \langle b_1, \ldots, b_r, u \rangle$
is in $V_0$ ($V_1$) if $u \in Q^0$ ($u \in Q^1$, respectively). We augment RGGs with an
acceptance condition $\mathcal{F}$. We restrict ourselves to Büchi conditions $F \subseteq N$. The
global game graph $T_A$, together with a start state $init \in N$ and acceptance
condition $F$, define an (infinite) game graph with an acceptance condition,
$B_A = (V, V_0, V_1, \Delta, \langle init \rangle, F^*)$, where $F^* = \{\langle \bar{b}, v \rangle \mid v \in F\}$. $B_A$ defines a game as
follows. The game begins at $s_0 = \langle init \rangle$. Thereafter, whenever we are at a state
$s_i$ in $V_0$ ($V_1$), Player 0 (respectively, Player 1) moves by choosing some transition
$(s_i, s_{i+1}) \in \Delta$. A run (or play) $\pi = s_0 s_1 \ldots$ is constructed from the infinite
sequence of moves. The run $\pi$ is accepting if it is accepted under the Büchi
condition $F^*$, meaning for infinitely many indices $i$, $s_i \in F^*$. We say that player
0 wins if $\pi$ is accepting, and player 1 wins if $\pi$ is not accepting. A player is said to
have a winning strategy in the Büchi game if it can play in such a way that it will win
regardless of how the other player plays (we omit a more formal description of
a winning strategy). We will also be interested in simpler reachability winning
conditions. Given $A$, a vertex $u \in Q$, and a set of nodes $Z \subseteq N$, define $u \Rightarrow_i Z$
to mean that player $i$ has a strategy to reach some state $\langle b_1, \ldots, b_r, v \rangle$ such that
$v \in Z$, from the state $\langle u \rangle$ in the global game graph $T_A$. We will be interested in
two algorithmic problems:

1. Winning under reachability conditions: Given $A$, a vertex $u \in Q$, and a set
of nodes $Z \subseteq N$, determine whether $u \Rightarrow_0 Z$.

2. Winning under Büchi conditions: Given $A$, $init \in N$, and $F \subseteq N$, determine
whether one player or the other has a winning strategy in the game defined
by $B_A$ under Büchi acceptance conditions.



3 Algorithm for the Reachability Game 

To check whether $v \Rightarrow_0 Z$, we will incrementally associate to each vertex $v$ of
component $A_i$ a set-of-sets $RSet_Z(v) \subseteq 2^{Ex_i}$ of exit nodes from $Ex_i$. Usually,
when $Z$ is clear from the context, we write $RSet(v)$ instead of $RSet_Z(v)$. The
empty set $\{\}$ will eventually end up in $RSet(v)$ iff $v \Rightarrow_0 Z$. We first make a more
general definition. For $u$ a vertex in component $A_i$, and $X \subseteq Ex_i$, and for an
arbitrary set of nodes $Z \subseteq N$, we write $u \Rightarrow_0 (Z, X)$ to mean that player 0
can “force the play”, starting at $\langle u \rangle$, to either reach a global state $\langle x \rangle$ for some
$x \in X$, or else to reach a global state $\langle b_1, \ldots, b_k, z \rangle$ such that $z \in Z$. Note that
$u \Rightarrow_0 Z$ if and only if $u \Rightarrow_0 (Z, \emptyset)$. As we will see, some subset $Y \subseteq X$ will
eventually end up in $RSet(u)$ if and only if $u \Rightarrow_0 (Z, X)$. The algorithm to build
the sets $RSet(v)$ proceeds as follows:

1. Initialization: for every $z \in Z$, we initialize $RSet(z) := \{\{\}\}$ to the set
containing only the empty set. To each exit point $x \in Ex_i$, not in $Z$, in any
component $A_i$, we associate the set $RSet(x) := \{\{x\}\}$. To all other vertices
$v$ we initially associate $RSet(v) := \{\}$.

2. Rule application: Inductively, we make sure the relationships defined by
rules (a), (b), and (c), below, hold between the sets associated with vertices,
based on a standard fixpoint iteration loop. We say an instance of a rule
is applicable if the right-hand side does not equal the left-hand side. While
there is an applicable instance of a rule, we apply it. For $S \subseteq 2^D$, over any
base set $D$, let

$Min(S) = \{X \in S \mid \forall X' \in S: \text{if } X' \subseteq X \text{ then } X' = X\}$

In other words, $Min(S)$ contains the minimal sets $X \in S$ such that there is
no $X' \in S$ which is strictly contained in $X$.

a) For every 0-vertex $v \in Q^0$ that isn't an exit, isn't a call, and isn't in $Z$:

$RSet(v) := Min(\bigcup_{(v,w) \in \delta} RSet(w))$

In other words, the set of sets associated with $v$ is the union of the sets of
sets associated with its successors, but only retaining the minimal sets
among these. So if it has two successors $w_1$ and $w_2$ with
$RSet(w_1) = \{\{x_1\}, \{x_2, x_3\}\}$ and $RSet(w_2) = \{\{x_2\}\}$ then $RSet(v) = \{\{x_1\}, \{x_2\}\}$.

b) For every 1-vertex $v \in Q^1$, with successors $\{w_1, \ldots, w_k\}$, where $v$ is not
an exit, isn't a call, and $v \notin Z$:

$RSet(v) := Min(\{\bigcup_{i=1}^{k} X_i \mid \forall i : X_i \in RSet(w_i)\})$

In other words, the set of sets associated with each 1-vertex $v$ will consist
of all possible unions of sets of its successors, retaining only minimal
sets among these. So, if $RSet(w_1) = \{\{x_1\}, \{x_3\}\}$ and
$RSet(w_2) = \{\{x_1, x_2\}, \{x_2, x_3\}\}$, then

$RSet(v) = Min(\{\{x_1, x_2\}, \{x_1, x_2, x_3\}, \{x_2, x_3\}\}) = \{\{x_1, x_2\}, \{x_2, x_3\}\}$.

Note that if for some $w_j$, $RSet(w_j) = \{\}$, then $RSet(v) := \{\}$.

c) For every call vertex $(b, e_i)$, where $e_i$ is an entry point of $A_i$, $A_i$ has exit
set $Ex_i = \{x_1, \ldots, x_k\}$, and $b$ is in component $A_j$, where $Y_j(b) = i$:

$RSet((b, e_i)) := Min(\{\bigcup_{x \in X} X_{b,x} \mid X \in RSet(e_i) \;\&\; X_{b,x} \in RSet((b, x))\})$

In other words, the set of sets $RSet((b, e_i)) \subseteq 2^{Ex_j}$ associated with the
call $(b, e_i)$ consists of the union, for each set $X \in RSet(e_i) \subseteq 2^{Ex_i}$, of
the sets $X_{b,x} \in RSet((b, x))$, where $x \in X$, and $(b, x)$ is a return of box
$b$. By convention, if $X = \{\}$, then $\bigcup_{x \in X} X_{b,x} = \{\}$. So empty sets carry
over from $RSet(e_i)$ to $RSet((b, e_i))$.
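To make the operator $Min$ and rules (a), (b), and (c) concrete, here is a small sketch. The encoding is an assumption made for illustration: a set of exits is a `frozenset`, a set-of-sets is a Python `set` of frozensets, and the function names are hypothetical. The asserts in the usage note replay the two worked examples from the text.

```python
from itertools import product


def minimal(sets):
    """Min(S): keep only the sets of S that have no strict subset in S."""
    sets = set(sets)
    return {x for x in sets if not any(y < x for y in sets)}


def rule_a(succ_rsets):
    """0-vertex: union of the successors' sets-of-sets, then Min."""
    return minimal(set().union(*succ_rsets))


def rule_b(succ_rsets):
    """1-vertex: pick one set per successor in every possible way, union each pick, then Min."""
    return minimal({frozenset().union(*choice) for choice in product(*succ_rsets)})


def rule_c(entry_rset, return_rsets):
    """Call (b, e): for each X in RSet(e), pick one set from RSet((b, x)) for every x in X.

    return_rsets maps each exit x to RSet((b, x)).
    """
    out = set()
    for X in entry_rset:
        for choice in product(*(return_rsets[x] for x in X)):
            out.add(frozenset().union(*choice))   # the empty X yields the empty set
    return minimal(out)
```

For instance, `rule_a([{frozenset({"x1"}), frozenset({"x2", "x3"})}, {frozenset({"x2"})}])` evaluates to `{frozenset({"x1"}), frozenset({"x2"})}`, matching the example given for rule (a). An empty successor set makes `product` yield nothing, which reproduces the remark that $RSet(v)$ is then empty.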

For any two sets-of-sets, $S, S' \subseteq 2^D$, over any base set $D$, we will say that $S'$
covers $S$ from below, and denote this by $S' \sqsubseteq S$, iff for every set $X$ in $S$ there
is a set $Y$ in $S'$ such that $Y \subseteq X$. Note that the empty set $\{\}$ is covered from
below by any other set, and that the set $\{\{\}\}$ containing only the empty set
covers every set from below. (We omit the lattice-theoretic formulation.)
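The covering relation is a one-line predicate. In this sketch (the function name and the frozenset encoding are assumptions for illustration), the two observations at the end of the paragraph can be checked directly.

```python
def covers_from_below(s_prime, s):
    """True iff every set X in s has some Y in s_prime with Y a subset of X."""
    return all(any(y <= x for y in s_prime) for x in s)
```

With `s` empty the outer `all` is vacuously true, and with `s_prime` containing the empty frozenset the inner `any` always succeeds.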

Theorem 1. For a vertex $u$ of component $A_i$, a set $Z \subseteq N$, and a set $X \subseteq Ex_i$:

1. Let $RSet(u)$ be the set associated with $u$ at some point during the algorithm,
and $RSet'(u)$ be the set associated with $u$ some time later. Then
$RSet'(u) \sqsubseteq RSet(u)$.

2. At any time during the algorithm $RSet(u)$ is minimal, i.e., if $Y \in RSet(u)$
there is no strict subset $Y' \subset Y$ with $Y' \in RSet(u)$.

3. The algorithm will halt, and $RSet(u)$ will get updated at most [...] times.

4. (*) Some subset $Y \subseteq X$ will eventually end up in $RSet(u)$ if and only if (**)
$u \Rightarrow_0 (Z, X)$.

In particular, letting $X = \emptyset$ in (4.), $u \Rightarrow_0 Z$ if and only if when the algorithm
halts $\{\}$ is in $RSet(u)$.

Proof. Claim (1.) asserts a monotonicity property of these rules. Namely, suppose
for vertex $u$, $RSet(u)$ depends on $RSet(w_i)$ for immediate “neighbours”
$\{w_1, \ldots, w_k\}$ (if $u$ is a call, these can be either return vertices of the box, or
entries of the corresponding component). If we are about to update $RSet(u)$ using
the $RSet'$ sets associated with its neighbours, suppose that for some neighbours,
$RSet'(w_i) \sqsubseteq RSet(w_i)$, while for others (that haven't been updated)
$RSet'(w_i) = RSet(w_i)$. We can show by inspection that applying each rule results
in a set $RSet'(u)$ such that $RSet'(u) \sqsubseteq RSet(u)$.

Claim (2.) follows from both the initial settings of the $RSet(u)$'s, and the fact that
each of our update rules retains only minimal sets.

Claim (3.) follows, because by claims (1.) and (2.) successive sets associated
with a vertex will always cover prior ones from below. Hence, since $\sqsubseteq$ defines a
partial order on these sets, there are no non-trivial chains that repeat the same
set. Updates are only performed on applicable rule instances, i.e., when the
respective set changes. Hence at most [...] updates are performed on $RSet(u)$.

Claim (4.): the easier direction is (*) $\Longrightarrow$ (**): Suppose $Y \subseteq X$, and $Y \in RSet(u)$.
By the “soundness” of our rules, there is a way for player 0 to force the game,
starting at $\langle u \rangle$, either into a state $\langle y \rangle$ for $y \in Y$, or else into $\langle \bar{b}, z \rangle$, for some
$z \in Z$. “Soundness” means: both the initial setting of the $RSet(u)$'s, and every rule
application, preserve the following invariant: if $Y \in RSet(u)$, then $u \Rightarrow_0 (Z, Y)$.
(**) $\Longrightarrow$ (*): Suppose $u \Rightarrow_0 (Z, X)$. Consider Player 0's winning strategy as a
finite tree $T_{win}$ with root $\langle u \rangle$, leaves labelled by two kinds of states, either of the
form $\langle x \rangle$, for $x \in X$, or of the form $\langle b_1, \ldots, b_r, z \rangle$, such that $z \in Z$. An internal
1-state of $T_{win}$ has as its children ALL its neighbours in $T_A$, while an internal
0-state has a single child, one of its neighbours in $T_A$. In addition, no non-leaf
(internal) node is of the form $\langle b_1, \ldots, b_r, z \rangle$, where $z \in Z$.

For $s = \langle b_1, \ldots, b_r, u \rangle$, let $vertex(s) = u$. Extending the notation, for a set of
states $S$, let $vertex(S) = \{vertex(s) \mid s \in S\}$. Consider a particular occurrence
of $s = \langle b_1, \ldots, b_r, u \rangle$ in $T_{win}$, where $u \in Q_i$.¹ We inductively define a “cut off
subtree” $T^s_{win}$ of $T_{win}$, whose root is (this) $s$ and such that for every state $s'$ of
$T^s_{win}$, if $b_1, \ldots, b_r$ is a prefix of every child of $s'$ (and $vertex(s') \notin Z$) then every
child of $s'$ in $T_{win}$ is also a child of $s'$ in $T^s_{win}$. (Note: either every child of $s'$
has $b_1, \ldots, b_r$ as a prefix, or none does, because in that case $s' = \langle b_1, \ldots, b_r, ex \rangle$
and $ex \in Ex_i$.) Note that when $s = \langle u \rangle$ is the root of $T_{win}$, $T^s_{win} = T_{win}$. For a
finite strategy tree $T$, let $leafexits(T) = \{s \mid s \text{ is a leaf of } T, \text{ and } vertex(s) \notin Z\}$.
We will show, by induction on the depth of $T^s_{win}$, that for every state $s$ in
$T_{win}$, $RSet(vertex(s))$ will eventually contain a set $Y \subseteq vertex(leafexits(T^s_{win}))$.
Applying this to the root $\langle u \rangle$ of $T_{win}$ yields that (**) $\Longrightarrow$ (*).

Base case: if the depth of $T^s_{win}$ is 0, then the only state of $T^s_{win}$ is $s = \langle b_1, \ldots, b_r, u \rangle$,
and either $u \in Z$, in which case $RSet(u) := \{\{\}\}$, or else $u$ is an exit node,
in which case $RSet(u) := \{\{u\}\}$. In either case $RSet(vertex(s))$ will contain
$Y = vertex(leafexits(T^s_{win}))$. Inductively: suppose the depth of $T^s_{win}$ is $n$, with
root $\langle b_1, \ldots, b_r, u \rangle$. Let the children of the root be $s_1, \ldots, s_m$. Each child $s_i$ is
itself the root of a cut-off subtree $T^{s_i}_{win}$ of depth $\le n - 1$ for player 0, and
thus by induction for each $i$, $RSet(vertex(s_i))$ will eventually contain a set
$Y_i \subseteq vertex(leafexits(T^{s_i}_{win}))$. We show that $RSet(u)$ will “after one update” contain
a $Y \subseteq vertex(leafexits(T^s_{win}))$. There are several cases based on $u$.

1. $u$ is an exit node. In this case $T^s_{win}$ will always be a trivial 1-node tree. The
set $RSet(u)$ will always be the same as its initial setting, and by the same
argument as the base case of our induction, $RSet(u)$ will contain the set
$vertex(leafexits(T^s_{win}))$.

2. $u$ is in $Z$. In this case, again, $T^s_{win}$ will always be a trivial 1-node tree. The
set $RSet(u)$ will always be $\{\{\}\}$, and so $RSet(u)$ will always contain the set
$vertex(leafexits(T^s_{win}))$.

3. $u$ is a non-exit 0-vertex of component $A_i$. $s$ must have one child $s'$ in the
strategy tree $T^s_{win}$. If $s \to s'$ is a move of $T_A$ within $A_i$ (i.e., $vertex(s)$ is not a
call), then by the inductive claim $RSet(vertex(s'))$ will eventually contain a
subset of $vertex(leafexits(T^{s'}_{win}))$. Thus, by rule (a) of the algorithm, $RSet(u)$
will “one update later” also contain a set $Y \subseteq vertex(leafexits(T^{s'}_{win}))$, but
since $leafexits(T^{s'}_{win}) = leafexits(T^s_{win})$, we have what we want. If, on the other
hand, the move from $s \to s'$ was a call, meaning $e = vertex(s')$ is an entry of
$A_j$, while $u = (b, e)$ is a call, then by induction $RSet(e)$ will eventually contain
a $Y \subseteq vertex(leafexits(T^{s'}_{win}))$. Consider this $Y = \{ex_1, \ldots, ex_c\} \subseteq Ex_j$,
and consider $leafexits(T^{s'}_{win}) = \{s_1, \ldots, s_c\}$ in the tree $T_{win}$. Each $s_i$ has
exactly one child $s'_i$ such that $vertex(s'_i) = (b, ex_i)$. Since $T^{s'_i}_{win}$ is a proper
subtree of $T^s_{win}$, by induction each $RSet((b, ex_i))$ will eventually contain a
set $Y_i \subseteq vertex(leafexits(T^{s'_i}_{win}))$. Now, using the fact that eventually $RSet(e)$
will contain $Y$ and that each $RSet((b, ex_i))$ will contain $Y_i$, we can use rule
(c) to obtain that “one update later” $RSet((b, e))$ will contain a subset of
$\bigcup_i vertex(leafexits(T^{s'_i}_{win}))$, which itself is a subset of $vertex(leafexits(T^s_{win}))$.

¹ There could be multiple occurrences of the same state $s$ of $T_A$ in $T_{win}$, but we assume
that $s$ is being somehow identified uniquely. We could do this by providing, together
with $s$, the specific “path name” to the node $s$ in $T_{win}$.

4. $u$ is a non-exit 1-vertex of component $A_i$. In this case, without loss of generality,
we can assume the only possibility is that $s$ must have children $s'_1, \ldots, s'_k$
in the strategy tree $T^s_{win}$, and all $s \to s'_i$ moves are moves of $T_A$ within $A_i$ (i.e.,
not a call, because call moves have only 1 successor in $T_A$, and hence calls
can be viewed as 0-vertices). Then by the inductive claim $RSet(vertex(s'_i))$
will eventually contain a subset $Y_i$ of $vertex(leafexits(T^{s'_i}_{win}))$. Thus, by rule
(b) of the algorithm, $RSet(u)$ will “one update later” also contain a set
$Y \subseteq \bigcup_i vertex(leafexits(T^{s'_i}_{win}))$, but since $leafexits(T^s_{win}) = \bigcup_i leafexits(T^{s'_i}_{win})$,
we are done.

□

Let $maxEx = \max_i |Ex_i|$. For an upper bound on the running time of the
algorithm, observe that each $RSet(u)$ gets updated at most [...] times. Each
rule application can be done in time at most [...], where $m$ is the maximum
number of “neighbouring” vertices $v'$, for any vertex $v$, such that $RSet(v)$
depends directly on $RSet(v')$ in some rule (the time taken for rule updates can
be heavily optimized with good use of data structures common in program
analysis). $m$ is clearly upper bounded by the number of vertices, but is typically
much smaller. There are $|A|$ vertices, so the worst case running time will be
$|A| \cdot$ [...]. However, observe that this worst-case analysis can be very
pessimistic, as the retained minimal sets may converge to a fixpoint well before
the $RSet(v)$'s ever grow large. That is the principal advantage, and hope, offered
by our equational approach. Of course, it will require much experimentation to
determine under what circumstances this advantage materializes.
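The whole procedure of this section fits in a short fixpoint loop. The sketch below is not the paper's implementation: it rescans all vertices on every pass instead of using a worklist, the input encoding (dictionaries `player`, `succ`, `calls`) and the name `solve_reachability` are hypothetical, and vertices without successors are assumed to force nothing, a case the text does not discuss.

```python
from itertools import product


def minimal(sets):
    """Min(S): drop every set that has a strict subset in S."""
    return {x for x in sets if not any(y < x for y in sets)}


def solve_reachability(player, succ, exits, calls, targets):
    """Compute RSet(v) for every vertex by naive fixpoint iteration.

    player:  dict vertex -> 0 or 1 (its keys are all vertices of the RGG)
    succ:    dict vertex -> list of successors under delta
    exits:   set of exit nodes
    calls:   dict call vertex (b, e) -> (e, {exit x: return vertex (b, x)})
    targets: the set Z of target nodes
    """
    rset = {}
    for v in player:                        # step 1: initialization
        if v in targets:
            rset[v] = {frozenset()}
        elif v in exits:
            rset[v] = {frozenset({v})}
        else:
            rset[v] = set()

    def apply_rule(v):
        if v in calls:                      # rule (c)
            entry, returns = calls[v]
            combos = set()
            for X in rset[entry]:
                for choice in product(*(rset[returns[x]] for x in X)):
                    combos.add(frozenset().union(*choice))
            return minimal(combos)
        successors = succ.get(v, [])
        if not successors:                  # assumption: a dead end forces nothing
            return set()
        if player[v] == 0:                  # rule (a)
            return minimal(set().union(*(rset[w] for w in successors)))
        combos = {frozenset().union(*choice)          # rule (b)
                  for choice in product(*(rset[w] for w in successors))}
        return minimal(combos)

    changed = True
    while changed:                          # step 2: apply rules until nothing changes
        changed = False
        for v in player:
            if v in targets or v in exits:
                continue                    # these keep their initial value
            new = apply_rule(v)
            if new != rset[v]:
                rset[v], changed = new, True
    return rset
```

Player 0 can force a visit to `targets` from a vertex `u` exactly when `frozenset()` is in `solve_reachability(...)[u]`. Replacing the outer scan with a worklist keyed on the dependency edges would be the natural optimization hinted at above.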



4 Algorithm for the Büchi Case

In the algorithm for Büchi conditions, the set-of-sets $BSet(u)$ that we associate
with each vertex will be more elaborate than just subsets of the exits. Recall that
$maxEx$ is the maximum number of exits of any component in $A$. Let $Calls$ be the
set of calls, i.e., call vertices $(b, e)$, in the RGG. Let $GoodVertices = F \cup Calls$
be the union of the set of accept vertices (nodes) and call vertices in $A$. Let
$maxMeasure = (|GoodVertices| \cdot 4^{maxEx}) + 1$.

Let us briefly sketch the intuition for the algorithm and proof.
For each vertex $u$ of $A_i$, $BSet(u)$ will contain sets of the form
$\{(\bot, m), (ex_1, tval_1), \ldots, (ex_k, tval_k)\}$, where $m \le maxMeasure$, where
each $ex_j \in Ex_i$, and where each $tval_j \in \{true, false\}$. The set can be
interpreted to mean that player 0 can play starting at $\langle u \rangle$ in such a way that, no
matter what player 1 does, we either will visit an accept node $m$ times during
our play, or else we will reach a state $\langle ex_j \rangle$, and we will do so having visited
an accept node along the way if $tval_j = true$. The rules will be used to update
these sets in a consistent way. What we will show is that $\{(\bot, maxMeasure)\}$
enters $BSet(u)$ iff there is a finite strategy tree rooted at $\langle u \rangle$ such that on every
path in the strategy tree we necessarily repeat a vertex, having visited an accept
state in between visits, with the stack getting no smaller in between visits, and
such that we are in the same “context of obligations” (to be made precise) in
both visits. This allows us to repeatedly apply the same substrategies in order
to achieve an infinite winning strategy tree.

Let $P_i = \{(ex, tval) \mid ex \in Ex_i \;\&\; tval \in \{true, false\}\}$. Let
$J = \{(\bot, m) \mid 1 \le m \le maxMeasure\}$. We say a set $X \in 2^{P_i \cup J}$ is well-formed if both the
following conditions hold: (1) For all $ex \in Ex_i$, $(ex, true) \notin X$ or $(ex, false) \notin X$
(or both). (2) If $(\bot, m) \in X$ then for all $m' \ne m$, $(\bot, m') \notin X$. Let $\tau_i$ be the set
of all well-formed sets in $2^{P_i \cup J}$. We will refer to $X \in \tau_i$ as a set of type $\tau_i$,
and similarly we will refer to sets-of-sets $S \subseteq \tau_i$ as having type $\tau_i$. For $S, S' \in \tau_i$,
we write $S \preceq S'$ iff (a), (b), (c) hold: (a) if for $ex \in Ex_i$, ($(ex, false) \notin S$ and
$(ex, true) \notin S$), then ($(ex, false) \notin S'$ and $(ex, true) \notin S'$), (b) if $(ex, true) \in S$,
then $(ex, false) \notin S'$, (c) if $(\bot, j) \in S$ then $(\bot, j') \in S'$ such that $j' \ge j$.

For a set $S \subseteq \tau_i$ let $Min(S)$ denote the set of those sets $X \in S$ that are minimal
with respect to $\preceq$ in $S$. Given a set $X$ of type $\tau_i$, let $Increment(X)$ be the
smallest set such that: if $(ex, tval)$ is in $X$, then $(ex, true)$ is in $Increment(X)$,
and if $(\bot, j)$ is in $X$, then $(\bot, \min\{j + 1, maxMeasure\})$ is in $Increment(X)$.
Extending the notation, for $S \subseteq \tau_i$, let $Increment(S) = \{Increment(X) \mid X \in S\}$.
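As a rough illustration of well-formedness and of the $Increment$ operator, consider the following sketch. The encoding is an assumption made here, not the paper's: a pair $(ex, tval)$ is a Python tuple with a boolean second component, the measure element is a tuple whose first component is the hypothetical marker `BOT`, and no exit is named `"bot"`.

```python
BOT = "bot"   # hypothetical marker standing for the bottom symbol in measure pairs


def well_formed(X):
    """Conditions (1) and (2): no exit with both truth values, at most one measure."""
    with_true = {k for (k, v) in X if k != BOT and v is True}
    with_false = {k for (k, v) in X if k != BOT and v is False}
    measures = [v for (k, v) in X if k == BOT]
    return not (with_true & with_false) and len(measures) <= 1


def increment(X, max_measure):
    """Set every exit's truth value to true and bump the measure, capped at max_measure."""
    out = set()
    for (key, val) in X:
        if key == BOT:
            out.add((BOT, min(val + 1, max_measure)))
        else:
            out.add((key, True))
    return frozenset(out)


def increment_all(S, max_measure):
    """Increment lifted to a set-of-sets."""
    return {increment(X, max_measure) for X in S}
```

For example, incrementing the set containing `("x1", False)` and `(BOT, 3)` with a cap of 5 gives the set containing `("x1", True)` and `(BOT, 4)`.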

For sets $X_1, \ldots, X_k$, each of the same type $\tau$, we define a “boxy union”
$X' = \bigsqcup_{i \in \{1, \ldots, k\}} X_i$ to be a subset of $X = \bigcup_{i \in \{1, \ldots, k\}} X_i$, as follows: if elements
$(ex, true)$ and $(ex, false)$ are both in $X$, then only $(ex, false)$ is in $X'$. Moreover,
there is a $(\bot, j')$ in $X'$ if $j'$ is the minimum value $j$ for which $(\bot, j)$ is in $X$.
Intuitively, “boxy union” reflects choices optimal for the adversary (player 1).
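A boxy union can be sketched as an ordinary union followed by two clean-up steps. The encoding is again an assumption made for illustration: pairs are Python tuples, truth values are booleans, and the marker `BOT` and the function name `boxy_union` are hypothetical.

```python
BOT = "bot"   # hypothetical marker standing for the bottom symbol in measure pairs


def boxy_union(sets):
    """Union the sets, keep (ex, False) over (ex, True), and keep only the least measure."""
    merged = set().union(*sets)
    out = set()
    measures = [val for (key, val) in merged if key == BOT]
    if measures:
        out.add((BOT, min(measures)))       # the adversary picks the smallest measure
    for (key, val) in merged:
        if key == BOT:
            continue
        if val is True and (key, False) in merged:
            continue                        # (ex, false) wins over (ex, true)
        out.add((key, val))
    return frozenset(out)
```

Combining `{("x1", True), (BOT, 3)}` with `{("x1", False), ("x2", True), (BOT, 2)}` yields `{("x1", False), ("x2", True), (BOT, 2)}`, the worst case for player 0 on each component.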

We will associate to each $u \in Q_i$ a set $BSet(u) \subseteq \tau_i$, such that $BSet(u)$ will
eventually contain $\{(\bot, maxMeasure)\}$ if and only if player 0 has a winning
strategy in the Büchi game. Let $F$ be the set of accepting nodes.

Initialization: For each $z \in F$, initially $\{(\bot, 1)\} \in BSet(z)$. Moreover, for each
$x \in Ex_i$, initially $BSet(x)$ contains $\{(x, tval)\}$, where $tval = true$ if $x \in F$, and
otherwise $tval = false$. For all other vertices $v$, we initialize $BSet(v) := \emptyset$.

Rule application: we make sure the following relationships hold:

1. For every 0-vertex $v$, with the exception of exits or calls, or nodes in $F$:

$BSet(v) := Min(\bigcup_{(v,w) \in \delta} BSet(w))$

2. For every 0-vertex $v$ that isn't an exit or call, but is in $F$:

$BSet(v) := Min(Increment(\bigcup_{(v,w) \in \delta} BSet(w)) \cup \{\{(\bot, 1)\}\})$

3. For every 1-vertex $v$ with successors $\{w_1, \ldots, w_k\}$ ($v$ not an exit or call), such
that $v$ is not in $F$:

$BSet(v) := Min(\{\bigsqcup_i X_i \mid \forall i : X_i \in BSet(w_i)\})$

4. For every 1-vertex $v$ with successors $\{w_1, \ldots, w_k\}$ ($v$ not an exit or call), such
that $v$ is in $F$:

$BSet(v) := Min(Increment(\{\bigsqcup_i X_i \mid \forall i : X_i \in BSet(w_i)\}) \cup \{\{(\bot, 1)\}\})$

5. For a call $(b, e_i)$ in component $A_j$, where $e_i$ is an entry point of $A_i$, and
where $A_i$ has exits $\{x_1, \ldots, x_k\}$,

$BSet((b, e_i)) := Min(\{(\bigsqcup_{x \in X} Incr_{X,x}(X_{b,x})) \sqcup D_X \mid X \in BSet(e_i) \;\&\; X_{b,x} \in BSet((b, x))\})$

where $D_X = \{(\bot, j) \mid (\bot, j) \in X\}$, and where $Incr_{X,x}(X')$ is equal to $X'$ if
there is an element $(x, false)$ in $X$, and otherwise is $Increment(X')$.

We call a rule application that changes the value of some BSet{v), an update. 
Theorem 2. The algorithm halts after at most |Ap • updates to each 

$BSet(v)$. When it halts, $\{(\bot, maxMeasure)\} \in BSet(u)$ if and only if player 0
has a winning strategy in the Büchi game $B_A$ starting at $\langle u \rangle$.

For the proof, please see the appendix. For the worst-case time complexity:
each update can be carried out in time at most [...], where
$m$ is the maximum number of “neighbouring” vertices of any vertex. So the
total running time is [...]. The algorithm as described will
always require at least $maxMeasure$ updates to reach $\{(\bot, maxMeasure)\}$ in
some $BSet(v)$. The algorithm can be reformulated to avoid this. We omit such
a reformulation.

5 Conclusions 

We have provided alternative algorithms, using second-order data flow equations,
for determining whether a player has a winning strategy on recursive
game graphs, a model that is expressively equivalent to pushdown games. Our
algorithms generalize the approach of using Datalog rules for analysis of recursive
state machines from [AEY01], as in [ATM03b], and they can also be viewed
as an “automaton-free” version of the algorithms given by [Cac02a] for pushdown
games. Several extensions of Cachat's work have appeared in more recent
literature. [CDT02] extends the algorithms to check properties such as “stack
boundedness” of infinite plays (a notion which was studied for runs of recursive
state machines in [AEY01]). Also, [Ser03,Cac02b] extend the work to games
with parity conditions. It may be possible to carry out these extensions in our
equational framework, but we have not done so here.
A Proof of Correctness for Büchi Case

Proof (of Theorem 2). Due to space we state several lemmas without proof.
First, ($\Longrightarrow$). We show that if $\{(\bot, maxMeasure)\}$ ends up in $BSet(u)$ then Player
0 has a winning strategy in the game (this is the harder direction). Suppose
$S = \{(\bot, maxMeasure)\}$ does end up in $BSet(u)$. Our first step is to construct a
“witness” to this in the form of a straight-line program, $W$, using the sets $BSet(v)$.
A witness $W$ for $u$ has the form: $(1, C_1, Pre_1)(2, C_2, Pre_2) \ldots (\ell, C_\ell, Pre_\ell)$. It
consists of $\ell$ lines of the form $(d, C_d, Pre_d)$, with $d$ the line number, and:

— $C_d$ is the content, and has the form $(v, S)$ where $v \in Q_i$ is a vertex of $A$,
$S \in \tau_i$, and there exists $S' \in BSet(v)$ such that $S' \preceq S$.

If $(\bot, m) \in S$, we say line $d$ “contains a measure”, and its measure is $m$.

— $Pre_d$ is a (possibly empty) list $[d_1, \ldots, d_k]$ of predecessor line numbers, with
$d_i < d$ for each $i$.

Moreover:

— No two lines have the same content, thus content determines line number.

— A line with no predecessors is called an initial line. If the $d$'th line is initial,
then it must be either of the form $(d, (v, \{(\bot, 1)\}), [])$, where $v \in F$, or of
the form $(d, (ex, \{(ex, tval)\}), [])$ where $ex$ is an exit, and $tval$ is true if and
only if $ex \in F$.

— If the $d$'th line has predecessors, it has the form $(d, (v, S), [d_1, \ldots, d_l])$, where:

• The content $(v, S)$ is implied in one step by the content of its predecessors.
By this we mean that if the contents of the $d_j$ are, respectively,
$(v_j, S_j)$, then there is a rule $\mathcal{R}$ such that if $S_j \in BSet(v_j)$ for each $j$,
then one application of $\mathcal{R}$ would put $S' \in BSet(v)$, where $S' \preceq S$.

E.g., if $v = (b, e)$ is a call, and $S = \{(\bot, m), (ex_1, tval_1), \ldots, (ex_k, tval_k)\}$,
then $d_1$ will be a line with content $(e, S_1)$ and $d_j$, for $j > 1$,
will have content $((b, ex'_j), S_j)$, such that $(ex'_j, tval)$ is in $S_1$, moreover
so that $S = \bigsqcup_{i \in \{2, \ldots, l\}} Incr_{S_1, ex'_i}(S_i) \sqcup \{(\bot, m') \mid (\bot, m') \in S_1\}$.

• Moreover, the measures $(\bot, m')$ in any of the content sets $S_1, \ldots, S_l$
of lines $d_1, \ldots, d_l$ should be as weak as possible in order to imply $S$,
meaning that it should not be possible to decrease the measure in any
single $S_j$ and still imply $S$ in one step.

— The last line, line $\ell$, is of the form $(\ell, (u, \{(\bot, maxMeasure)\}), Pre_\ell)$.

Lemma 1. If $\{(\bot, maxMeasure)\} \in BSet(u)$ then a witness $W$ for $u$ exists.

The lemma can be proved by induction on the number of rule applications it took
for $\{(\bot, maxMeasure)\}$ to enter $BSet(u)$. We will use $W$ to define a winning
strategy tree for player 0. To do this, we will first need some facts about $W$. By a
path from a line $d$ to a line $d'$, $d > d'$, in $W$, we mean a sequence $\gamma = d_1, \ldots, d_n$,
such that $d = d_1$, $d' = d_n$, and $d_{i+1} \in Pre_{d_i}$ for all $i \in \{1, \ldots, n - 1\}$. (Note: the
path goes from higher to lower line numbers.)
Lemma 2. On any path $\gamma = d_1, \ldots, d_n$ in $W$, the measure is non-increasing,
meaning, for $i < j$, if line $d_i$ contains a measure $m$ then line $d_j$ either contains a
measure $m' \le m$, or does not contain a measure, and if line $d_i$ does not contain
a measure then neither does line $d_j$.

Proof. A line has a measure iff any of its predecessors has a measure. Moreover,
because of the minimality constraint on predecessor measures in $W$, the measure
of a line $(d, (v, S), Pre_d)$ is the same as that of its predecessors that contain a
measure unless $v$ is either a call or an accept node (i.e., a good vertex), in which
case the measure can be 1 greater. The measure is therefore non-increasing. □

For a path $\gamma$ in $W$, let $goodLines(\gamma)$ be the number of lines of $\gamma$ of the form
$(d, (v, S), Pre_d)$ where $v \in GoodVertices$, i.e., $v$ is either a call vertex or an
accept node. Let $d$ and $d'$ be two lines such that there is a path from $d$ to $d'$ in $W$,
and let $goodDist_W(d, d') = \min\{goodLines(\gamma) \mid \gamma \text{ is a path from } d \text{ to } d' \text{ in } W\}$.
For two lines $(d, (v, S), Pre_d)$ and $(d', (v', S'), Pre_{d'})$ that both contain a
measure, we say the contents $C_d$ and $C_{d'}$ are the same except for the measure if
$v = v'$, and $S$ and $S'$ differ only by the fact that $(\bot, m) \in S$, and $(\bot, m') \in S'$,
with $m \ne m'$. In such a case, we write $C_d \ll C_{d'}$ if $m < m'$.

Lemma 3. Let $d$ be any initial line of $W$ of the form $(d, (v, \{(\bot, 1)\}), [])$ and let
$\ell$ be the last line number, then

1. $goodDist_W(\ell, d) \ge maxMeasure$.

2. On any path $\gamma = \ell = d_r d_{r-1} \ldots d_1 = d$ from $\ell$ to $d$, there exist two distinct
lines $d_i$ and $d_j$, $i > j$, such that $C_{d_j} \ll C_{d_i}$.

Proof. (1) The measure at $\ell$ is $maxMeasure$, while the measure at $d$ is 1. Thus,
in a path $\gamma$ from line $\ell$ to $d$, since we must have $maxMeasure - 1$ opportunities
to decrement the measure, and can only do so on good lines, and since line $d$ is
also a good line, we must encounter at least $maxMeasure$ good lines.

(2) There are at most $(|GoodVertices| \cdot 4^{maxEx})$ different contents $(v, S)$ where
$v$ is a good vertex (counting only once contents that are the same except for the
measure). Thus, since $maxMeasure = (|GoodVertices| \cdot 4^{maxEx}) + 1$, by the
pigeon-hole principle and by (1), there must be two such lines $d_i$ and $d_j$ in any
$\gamma$. Since the measure is non-increasing, we must have $C_{d_j} \ll C_{d_i}$. □

We now use $W$ to construct a strategy tree for player 0. Each node of the
strategy tree will be labelled by a triple $(s, d, Stack)$, where:

— $s$ is a global state $\langle \bar{b}, v \rangle$ of $B_A$.

— $d$ is a line number in the witness straight-line program $W$.

— $Stack$ is a stack $[\beta_j, \beta_{j-1}, \ldots, \beta_1]$, where each $\beta_i$ defines a mapping from a
subset of the exits of a component to line numbers in $W$ (in a consistent way
to be defined).

The root of the strategy tree will be labelled by $(\langle u \rangle, \ell, [])$ (where $\ell$ is the last line
of $W$). Thereafter, if a node is labelled by $(\langle \bar{b}, v \rangle, d, Stack)$, we use $W$ to construct
its children. Suppose, for example, line $d$ of $W$ is $(d, (v, S), [d_1, d_2, \ldots, d_l])$, and
suppose $v = (b, e)$ is a call. Let line $d_1$ have content $(e, S_1)$. Then we create
one child for the node and label it by the tuple $(\langle \bar{b}, b, e \rangle, d_1, push(\beta, Stack))$,
where $\beta$ is a mapping from each $ex_j$, such that $(ex_j, tval_j)$ is in $S_1$, to a line $d_{i_j}$
whose content is of the form $((b, ex_j), S_j)$. In other words, the element pushed
on the $Stack$ tells us where in the witness program to return to when returning
from $b$ on the call stack, as dictated by the predecessors $d_2, \ldots, d_l$ in line $d$. For
other kinds of vertices $v$, we can construct corresponding children of nodes in
the strategy tree. If in the strategy tree so constructed we reach a state $\langle \bar{b}, b, ex \rangle$,
whose vertex $ex$ is an exit, then we pop $Stack$ and use its contents to dictate
what the children of that node should be labelled, using the state content to
assign the line number of $W$ to the triple associated with the children.

Lemma 4. The construction outlined above yields a finite tree $T_W$, all of whose
leaves are labelled by triples of form $(\langle \bar{b}, v \rangle, d, Stack)$, where line $d$ is an initial
line of $W$ with content $(v, \{(\bot, 1)\})$.

The lemma can be established using the structure of $W$. The key point is that if
we ever reach an exit following the program $W$, we know that our $Stack$ contains
a return address where we may continue to build $T_W$. We want to build from
$T_W$ an infinite strategy tree.

Lemma 5. For any leaf $z$ in $T_W$ labelled by $(\langle \bar{b}, v \rangle, d, Stack)$, the root-to-leaf
path in $T_W$ must include a subsequence $H_{maxMeasure}, \ldots, H_1$, where
$H_i = (\langle \bar{b}_i, v_i \rangle, d_i, Stack_i)$, and where

— $v_i$ is a good vertex of $A$, for each $i$,

— each $\bar{b}_i$ is a prefix of $\bar{b}$, and every node on the path from $H_i$ to $z$ contains
$\bar{b}_i$ as a prefix of its call stack (and hence the $Stack$ at every such node also
contains as a prefix the stack $Stack_i$ at $H_i$).

— $H_i$ has $(v_i, S_i)$ as the content of its line, such that $(\bot, i) \in S_i$.

In other words, the subsequence $H_{maxMeasure}, \ldots, H_1$ of nodes along the
root-to-leaf path in $T_W$ will witness the decrementation of the counter from
$maxMeasure$ down to 1. The counter can only be decremented by 1 at a
time, and so all such distinct witnesses must exist. Now, because of the size
of $maxMeasure$, there must be two distinct $H_r$ and $H_{r'}$, $r' < r$, with labels
$(\langle \bar{b}_r, v \rangle, d_r, Stack_r)$ and $(\langle \bar{b}_{r'}, v \rangle, d_{r'}, Stack_{r'})$, such that they both have the same
vertex $v$ (which must be a good vertex, either an accept node or a call), and
such that the lines $d_r$ and $d_{r'}$ have content $(v, S)$ and $(v, S')$, where $S$ and $S'$
are exactly the same except that $(\bot, r)$ is in $S$ while $(\bot, r')$ is in $S'$. We mark
any such node $H_{r'}$. We then eliminate the subtrees rooted at all marked nodes
in $T_W$. This yields another finite tree $T'$. We will use $T'$ to construct an infinite
strategy tree $T^*$ that is accepting for player 0. Let $M$ be the set of (good)
vertices $v$ such that a state $\langle \bar{b}, v \rangle$ labels some leaf of $T'$.

Lemma 6. For any leaf of $T'$ labelled $L = (\langle \bar{b}, v \rangle, d, Stack)$, there is a finite
strategy tree $T_L$ for player 0, with root labelled $L$, such that the state labels on
every root-to-leaf path contain $\bar{b}$ as a stack prefix, and such that all leaf labels
$(\langle \bar{b}', v' \rangle, d', Stack')$ have $v' \in M$. Moreover, every root-to-leaf path in $T_L$
contains an accept state.
Using the lemma we can incrementally construct $T^*$ from $T'$. We repeatedly
attach a copy of $T_L$ to every leaf labelled $L$ in the tree $T'$. Since the process
always produces leaves labelled $s$ whose vertex $v$ is in $M$, and since the stacks can
only grow, we can extend the tree indefinitely. Every path will contain infinitely
many accept states, because each finite subtree that we attach contains an accept
state on every root-to-leaf path.

($\Longleftarrow$): If Player 0 has a winning strategy in the Büchi game starting at $\langle u \rangle$, then
$\{(\bot, maxMeasure)\}$ will eventually enter $BSet(u)$. We omit the proof, which is
similar to the proof that if player 0 has a winning strategy in the reachability
game from $\langle u \rangle$, then $\{\}$ will eventually enter $RSet(u)$. □
Abstract. Java is a very successful programming language which is also becoming
widespread in embedded systems, where software correctness is critical. Jlint
is a simple but highly efficient static analyzer that checks a Java program for several
common errors, such as null pointer exceptions and overflow errors. It also
includes checks for multi-threading problems, such as deadlocks and data races. The
case study described here shows the effectiveness of Jlint in finding certain faults,
including multi-threading problems. Analyzing the reasons for false positives in
the multi-threading warnings gives an insight into design patterns commonly used
in multi-threaded code. The results show that a few analysis techniques are
sufficient to avoid almost all false positives. These techniques include investigating all
possible callers and a few code idioms. Verifying the correct application of these
patterns is still crucial, because their correct usage is not trivial.



1 Introduction 

Java is becoming more widespread in the area of embedded systems, either as a scaled-down
“Micro Edition” [20] or with real-time extensions [6,5]. In such systems,
software cannot always be replaced on a running system. Failures may have expensive
or even catastrophic consequences. These costs are obviously prohibitively high when a
software-related problem causes the failure of a space craft [14]. Therefore an automated
tool which can detect faults easily, preferably early in the lifecycle of software, can be
very useful. One tool that allows fault detection easily, even in incomplete systems, is
Jlint. Among similar tools geared towards Java, it is one of the most suitable with respect
to ease of use (no annotations required) and free availability (the tool is Open Source)
[1].

1.1 The Java Programming Language 

Java is a modern, object-oriented programming language that has had a large success 
in the past few years. Source code is not compiled to machine code, but to a different 
form, the bytecode. This bytecode runs in a dedicated environment, the virtual machine. 
In order to guarantee the integrity of the system, each class file containing bytecode is 
checked prior to execution [11,19,21].

The Java language allows each object to have any number of fields, which are at- 
tributes of each object. These may be static, i.e., shared among all instances of a certain 
class, or dynamic, i.e., each instance has its own fields. In contrast to that, local variables 
are thread-local and only visible within one method. 
Java allows inheritance: a method of a given class may be overridden by a method of
the same name. Similarly, fields in a subclass shadow those with the same name in the 
superclass. In general, these mechanisms work well for small code examples but may be 
dangerous in larger projects. Methods overriding other methods must ensure they do not 
violate invariants of the superclass. Similar problems occur with variable shadowing. 
The programmer is not always aware that a variable with the same name already exists 
on a different level, such as the superclass. 

In order to prevent incorrect programs from corrupting the system, Java’s virtual 
machine has various safety mechanisms built in. Each variable access is guarded against 
manipulating memory outside the allocated area. In particular, pointers must not be 
null when dereferenced, and array indices must be in a valid range. If these properties 
are violated, an exception is thrown indicating a programming error. This is a highly 
undesirable behavior in most cases. Ideally, such errors should be prevented by static 
analysis, rather than caught at run-time. 

Furthermore, Java offers mechanisms to write multi-threaded programs. The two key 
mechanisms are locking primitives, using the synchronized keyword, and inter-thread 
synchronization with the wait and notify methods. Incorrect lock usage using too 
many locks may lead to deadlocks. For example, if two threads each wait on a lock held 
by the other thread, both threads cannot continue their execution. On the other hand, if 
a value is accessed with insufficient lock protection, data races may occur: two threads 
may access the same value concurrently, and the results of the operations are no longer 
deterministic. 
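
As a minimal illustration of the data race described above, the following sketch guards a shared field with synchronized so that two threads incrementing it concurrently always produce the same total. The class SafeCounter and its methods are hypothetical names chosen for this example; they are not taken from the analyzed code.

```java
// Hypothetical example: a shared counter guarded by its own monitor.
class SafeCounter {
    private int value = 0;               // shared field

    // synchronized makes the read-modify-write atomic with respect to
    // other synchronized methods on the same object.
    synchronized void increment() { value++; }

    synchronized int get() { return value; }

    // Runs two threads that each increment the same counter n times.
    static int runTwoThreads(int n) {
        SafeCounter c = new SafeCounter();
        Runnable work = () -> { for (int i = 0; i < n; i++) c.increment(); };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return c.get();
    }
}
```

Without the synchronized keyword on increment, the two threads could interleave between reading and writing value, and the final count would no longer be deterministic.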

Java’s message passing mechanisms for threads are also a source of problems. A call to
wait allows a thread to suspend until a condition becomes true, which must be signaled 
by notify by another thread. When calling wait the calling thread must ensure that it 
owns the lock it waits on, and also release any other locks before the call. Otherwise, 
remaining locks held are unavailable to other threads, which may in turn block when 
trying to obtain them. This can prevent them from calling notify which would allow 
the waiting thread to release its lock. This situation is also a deadlock. 
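
The wait/notify protocol described above can be sketched as follows. This is a hypothetical one-slot mailbox written for illustration only (the names Mailbox, take, put, and handOff are assumptions): both methods own the lock on this when they call wait, and the waiting condition is re-checked in a loop after each wake-up.

```java
// Hypothetical one-slot mailbox illustrating the wait/notify protocol.
class Mailbox {
    private Integer slot = null;          // null means "empty"

    // The method is synchronized, so the caller owns the monitor of `this`
    // when wait() is invoked, as the Java semantics require.
    synchronized int take() throws InterruptedException {
        while (slot == null) {            // re-check the condition after waking up
            wait();                       // releases the lock on `this` while waiting
        }
        int v = slot;
        slot = null;
        notifyAll();                      // wake producers waiting for an empty slot
        return v;
    }

    synchronized void put(int v) throws InterruptedException {
        while (slot != null) {
            wait();
        }
        slot = v;
        notifyAll();                      // wake consumers waiting for a value
    }

    // Hands one value from a producer thread to the calling thread.
    static int handOff(int v) {
        Mailbox m = new Mailbox();
        Thread producer = new Thread(() -> {
            try {
                m.put(v);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            int got = m.take();
            producer.join();
            return got;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Neither method holds any other lock while waiting, which avoids the situation where a thread that should call notify blocks on a lock still held by the waiting thread.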

1.2 Related Work 

Much effort has gone into fault-finding in Java programs, single-threaded and multi- 
threaded. The approaches can be separated into static checkers, which check a program 
at compile-time and try to approximate its run-time behavior, and dynamic checkers, 
which try to catch and analyze anomalies during program execution. 

Several static analysis tools exist that examine a program for faults such as null 
pointer dereferences or data races. The ESC/Java [9] tool is, like Jlint, also based on 
static analysis, or more generally on theorem proving. It, however, requires annotation 
of the program. While it is more precise than Jlint, it is not nearly as fast and requires a 
large effort from the user to fully exploit the power of this tool [9]. 

Dynamic tools have the advantage of having more precise information available in 
the execution trace. The Eraser algorithm [22], which has been implemented in the Visual
Threads tool [12] to analyze C and C++ programs, is such an algorithm that examines
a program execution trace for locking patterns and variable accesses in order to predict 
potential data races. 
The Java PathExplorer tool (JPaX) [16] performs deadlock analysis and the Eraser
data race analysis on Java programs. It has furthermore recently been extended with the
high-level data race detection algorithm described in [3]. This algorithm analyzes how 
collections of variables are accessed by multiple threads. 

More heavyweight dynamic approaches include model checking, which explores 
all possible schedules in a program. Recently, model checkers have been developed 
that apply directly to programs (instead of just models thereof). This includes the Java 
PathFinder system (JPF) developed by NASA [15,24], and similar systems [10,8,17,4,23].
Such systems, however, suffer from the state space explosion problem. In [13] we
describe an extension of Java PathFinder which performs data race analysis (and deadlock
analysis) in simulation mode, whereafter the model checker is used to demonstrate 
whether the data race (deadlock) warnings are real or not. 

This paper focuses on applying Jlint [2] to the software for detecting errors statically. 
Jlint uses static analysis and abstract interpretation to find difficult errors at compile- 
time. A similar case study with Jlint has been made before, applying it to large projects 
[2]. The difference to this case study is that the other case study had scalability in 
mind. Jlint had been applied to packages containing several hundred thousand lines 
of code, generating hundreds of warning messages. Because of this, the warnings had 
been evaluated selectively, omitting some hard-to-check deadlock warnings. In this case 
study, an effort was made to analyze every single warning and also see what kinds of 
design patterns cause false positives.¹

1.3 Outline 

This text is organized as follows: Section 2 describes Jlint and how it was used for this 
project. Sections 3 and 4 show the results of applying Jlint to space exploration program 
code. Design patterns which are common among these two projects are analyzed in 
Section 5. Section 6 summarizes the results and concludes. 



2 Jlint 

2.1 Tool Description 

Jlint checks Java code and finds bugs, inconsistencies and synchronization problems by 
performing a data flow analysis, abstract interpretation, and building the lock graph. It 
issues warnings about potential problems. These warnings do not imply that an actual 
error exists. This makes Jlint unsound as a program prover. Moreover, Jlint can also
miss errors, making it incomplete. The reason for this is that the goal was to make Jlint
practical, scalable, and possible to implement in a short time.

Typical warnings about possible faults issued by Jlint are null pointer dereferences, 
array bounds overflows, and value overflows. The latter may occur if one multiplies two 
32 bit integer values without converting them to 64 bit first. 
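
The value overflow mentioned above can be illustrated with a short sketch. The class and method names are hypothetical; the example only shows the difference between widening after and before the multiplication.

```java
// Hypothetical example of the 32-bit multiplication overflow.
class OverflowDemo {
    // Faulty: the multiplication is carried out in 32-bit arithmetic and may
    // wrap around before the result is widened to long.
    static long truncatedProduct(int a, int b) {
        return a * b;
    }

    // Correct: converting one operand first forces a 64-bit multiplication.
    static long widenedProduct(int a, int b) {
        return (long) a * b;
    }
}
```

For small operands both methods agree; for operands such as 100000 and 100000 only widenedProduct yields the mathematically correct result.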

¹ Design patterns commonly denote compositions of objects in software. In this paper, the notion
of composition is different. It includes lock patterns and sometimes only applies to a small part 
of the program. In that context, we also use the term “code idiom”. 
Many warnings that Jlint issues are code guidelines: A local variable should never
have the same name as a field of the same class or a superclass. When a method of a 
given name is overridden, all its variants should be overridden, in order to guarantee a 
consistent behavior of the subclass. 
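
The first guideline can be illustrated with a small sketch; the class Sensor and its members are hypothetical and serve only to show how a shadowing local variable silently changes the meaning of an assignment.

```java
// Hypothetical example: a local variable with the same name as a field.
class Sensor {
    int reading = 0;                      // field

    // Faulty: the declaration introduces a new local variable that shadows
    // the field, so the field itself is never updated.
    void updateShadowed(int newValue) {
        int reading = newValue;
    }

    // Correct: the assignment refers to the field.
    void update(int newValue) {
        this.reading = newValue;
    }
}
```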

Jlint also includes many analyses for multi-threaded programs. Some of Jlint’s warnings
for multi-threaded programs are overly cautious. For instance, possible data race
warnings for method calls or variable accesses do not necessarily imply a data race. The
reasons for such false positives are both difficulties inherent to static analysis, such as
pointer aliasing across method calls, and limitations in Jlint itself, where its algorithms
could be refined with known techniques.

Jlint works in two passes: a first pass, where all methods are analyzed in a modular 
way, and a second pass with the deadlock analysis. In the first pass, each method is 
analyzed with abstract interpretation. The abstraction used for numbers includes their 
maximal possible range and, for integers, bit masks that apply to them. For pointers, 
the abstraction records whether a pointer is possibly null or not, distinguishing the 
special this pointer, values loaded from fields, and new instance references created by 
the object constructor. The data flow analysis merges possible program states at branch 
targets but only executes loops once. 
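
To make the range abstraction more concrete, the following is a minimal sketch of an interval domain for integer values. It is not Jlint’s implementation: the class Interval and its methods are hypothetical, bit masks and pointer abstractions are omitted, and the operands of mul are assumed to lie within the 32-bit range so that the corner products fit into a long.

```java
// Hypothetical interval domain for integer values. Bounds are kept as long
// so that results outside the 32-bit range can be represented and detected.
final class Interval {
    final long lo, hi;

    Interval(long lo, long hi) {
        if (lo > hi) throw new IllegalArgumentException("lo > hi");
        this.lo = lo;
        this.hi = hi;
    }

    // Join of two abstract values, as needed when program states are merged
    // at a branch target.
    Interval join(Interval o) {
        return new Interval(Math.min(lo, o.lo), Math.max(hi, o.hi));
    }

    // Abstract multiplication: the extremes of the product lie at the corner
    // products. Assumes both operands are within the 32-bit range.
    Interval mul(Interval o) {
        long a = lo * o.lo, b = lo * o.hi, c = hi * o.lo, d = hi * o.hi;
        return new Interval(Math.min(Math.min(a, b), Math.min(c, d)),
                            Math.max(Math.max(a, b), Math.max(c, d)));
    }

    // True if some value in the interval does not fit into a 32-bit int.
    boolean mayOverflowInt() {
        return lo < Integer.MIN_VALUE || hi > Integer.MAX_VALUE;
    }
}
```

An analysis built on such a domain would report a possible value overflow whenever the abstract result of a 32-bit operation may leave the int range.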

Most properties, such as possible value overflows, are analyzed in this first pass. The 
lock graph is also built in this first pass and checked for deadlocks in a second pass. A 
possible refinement would be to defer some data race analyses to the second pass, where 
global information, such as if a field is read-only, can be made available. 
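
The deadlock check on the lock graph can be sketched as a cycle search. The code below is an illustration under simplifying assumptions, not Jlint’s algorithm: locks are identified by hypothetical string names, and an edge from A to B records that some code path acquires B while holding A.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical lock graph: an edge A -> B means "B is acquired while A is held".
class LockGraph {
    private final Map<String, Set<String>> edges = new HashMap<>();

    void addEdge(String held, String acquired) {
        edges.computeIfAbsent(held, k -> new HashSet<>()).add(acquired);
        edges.computeIfAbsent(acquired, k -> new HashSet<>());
    }

    // A cycle in the graph indicates a potential deadlock.
    boolean hasCycle() {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String n : edges.keySet()) {
            if (dfs(n, done, onPath)) return true;
        }
        return false;
    }

    private boolean dfs(String n, Set<String> done, Set<String> onPath) {
        if (onPath.contains(n)) return true;   // back edge: cycle found
        if (done.contains(n)) return false;    // already fully explored
        onPath.add(n);
        for (String m : edges.get(n)) {
            if (dfs(m, done, onPath)) return true;
        }
        onPath.remove(n);
        done.add(n);
        return false;
    }
}
```

A cycle found this way is only a warning: whether two threads can actually hold the locks involved at the same time still has to be established by further analysis or review.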



2.2 Warning Review Process 

Jlint gives fairly descriptive warnings for each problem found. The context given is 
limited to the class in which the error occurs, the line number, and fields used or methods 
called. This is always sufficient to find the source of simple warnings, which concern 
sequential properties such as null pointer dereferences. These warnings are easy to 
review and were considered in a first review phase. The other warnings, concerning 
multi-threading problems, take much more time to consider, and were evaluated in a 
second phase. 

The review process essentially checks whether the problems described in the warn- 
ings can actually occur at run-time. In simple cases, warnings may be ruled out given 
the algorithmic properties of the program. Complex cases include reviewing callers to 
the method in question. 

Data race and deadlock warnings fall in the latter category. They require constructing
a part of the call graph including locks owned by callers when a method is called. If it
can be ensured that all calls to non-synchronized, shared methods are made only through
methods that already employ lock protection, then there cannot be a data race.²

This review process can be rather time-consuming. Many warnings occur in similar
contexts, so warnings referring to the same problem can usually be easily confirmed as
duplicates. This part of the review process was not yet automated in any way but could
be automated to a large extent with known techniques. Both case studies were made
without prior knowledge of the program code. It can be assumed that the time to review
the warnings is shorter for the author of the code, especially when reviewing data race
or deadlock warnings.

² Methods that access a shared field are also considered “shared” in this context. The lock used
for ensuring mutual exclusion must be the same lock for all calls.

During the review process, Jlint’s warnings were categorized to see whether they refer
to the same problem. Such situations include calls to the same method from different
callers, the same variable used in different contexts, or the same design pattern applied
throughout the class. In a separate count, counting the number of distinct problems rather
than individual warnings, all such cases were counted once. Note that the review activity
was often interrupted by other activities such as writing this paper. We believe this
reduced the overall time required because manual code reviews require much attention,
and cannot be carried out in one run without a degradation of the concentration required.

3 First Case Study: Rover Code 

The first case study is a software module, called the Executive, for controlling the move- 
ment of the planetary wheeled rover K9, developed at NASA Ames Research Center. 
The run time for analyzing the code with Jlint was 0.10 seconds on a PowerPC G4 with
a clock frequency of 500 MHz. 

3.1 Description of the Rover Project 

K9 is a hardware platform for experimenting with rover technology for exploration 
of the Martian surface. The Executive is a software module for controlling the rover, 
and is essentially an interpreter of plans, where a plan is a special form of a program. 
Plans are constructed from high-level constructs, such as sequential composition and 
conditionals, but no while loops. The effect of while loops is achieved by assuming that 
plans are generated on the fly during rover operation as environment conditions change. 
The lowest level nodes of a plan are tasks to be directly executed by the rover hardware. A 
node in a plan can be further constrained by a set of conditions, which when failing during 
execution, cause the Executive to abort the execution of the subsequent sibling nodes, 
unless specified otherwise through options. Examples of conditions are pre-conditions 
and post-conditions, as well as invariants to be maintained during the execution of the 
node. The examined Executive consists of 7,300 lines of Java code. This code was
extracted by a colleague from the original rover code, written in 35,000 lines of C++.
The code is highly multi-threaded, and hence poses a risk of concurrency errors.
The Java version of the code was extracted as part of a different project, the purpose of
which was to compare various formal methods, such as model checking, static analysis,
runtime analysis, and simple testing [7]. The code contained a number of seeded errors.

3.2 Jlint Evaluation

Jlint issues 249 warnings when checking the Rover code. Table 1 summarizes Jlint’s
output. The first two columns show each type of problem and how many warnings
Jlint generated for them. The third, fourth, and fifth columns show the result of the manual
Table 1. Jlint’s warnings for the Rover code.

Type                                  Warnings  Problems   Correct     False      Time
                                                   found  warnings positives    [min.]
null pointer                                 5         1         4         1        10
Integer overflow                             2         2         2         0         5
equals overridden but not hashCode           2         1         2         0         1
String comparison as reference               1         0         0         1         1
Total: Sequential errors                    10         4         8         2        17
Incorrect wait/notify usage                 21         5         5        16        26
Data race, method call                     157         5        18       139       112
Data race, field access                     31         0         0        31        43
Deadlock                                    30         7        20        10        36
Total: Multi-threading errors              239        17        43       196       217
Total                                      249        21        51       198       234



source code analysis: how many actual, distinct faults, or at least serious problems, in the 
code were found, how many warnings described such actual faults, and how many were 
considered to be false positives. The last column shows the time spent on code review. 
In the first phase, focusing on sequential properties, ten warnings were reviewed, while 
the second phase had 239 warnings to be reviewed. 



Sequential errors: Among the problems found are two integer overflows, where two
32-bit integers were multiplied to produce a 64-bit result. However, integer conversion
took place after the 32-bit multiplication, where an overflow may occur.

Two other warnings referred to one problem, where equals was overridden, but not 
hashCode. This is dangerous because the modified equals method may return true for 
comparing two objects even though their hashCode differs, which is forbidden [21]. 
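
A sketch of the consistent variant is shown below. The class PlanNodeId and its fields are hypothetical and not taken from the Rover code; the point is only that hashCode is derived from exactly the fields that equals compares, so that equal objects are found again in hash-based collections.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class that overrides equals and hashCode consistently.
final class PlanNodeId {
    private final String name;
    private final int index;

    PlanNodeId(String name, int index) {
        this.name = name;
        this.index = index;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlanNodeId)) return false;
        PlanNodeId other = (PlanNodeId) o;
        return index == other.index && name.equals(other.name);
    }

    // Must agree with equals: equal objects yield equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(name, index);
    }

    // Looks up an equal but distinct instance in a hash-based set.
    static boolean foundInSet() {
        Set<PlanNodeId> seen = new HashSet<>();
        seen.add(new PlanNodeId("drive", 1));
        return seen.contains(new PlanNodeId("drive", 1));
    }
}
```

If hashCode were left at its inherited default, the lookup in foundInSet would typically fail even though equals returns true for the two instances.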

A noteworthy false positive concerned two strings that were compared as references. 
This was correct in that context because one of the strings was always known to be null. 
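
The difference between the two kinds of string comparison can be sketched as follows (hypothetical helper names). Comparing references is only meaningful in special cases such as the null check mentioned above; comparing contents requires equals.

```java
// Hypothetical helpers contrasting reference and content comparison.
class StringCompareDemo {
    // Compares object identity, not contents.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Compares contents; java.util.Objects.equals also tolerates null arguments.
    static boolean sameContents(String a, String b) {
        return java.util.Objects.equals(a, b);
    }
}
```

Two distinct String objects with the same characters are equal by content but not by reference, which is why a tool flags reference comparisons of strings as suspicious.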



Multi-threading errors: The number of deadlock and data race warnings given by
Jlint was almost prohibitive. Yet, for answering the question why the false positives
were generated, all warnings were investigated. All warnings were relatively easy to
analyze. In most cases, possible callers were within the same class. Only for the most
complex class, the call graph was large, making analysis more difficult.³

A surprisingly high number of multi-threading warnings were of type “Method
’<this>.wait|notify|notifyAll’ is called without synchronizing on ’<this>’.”
After discounting dead code and false positives, one scenario remained: A lock was obtained
conditionally, although it should be obtained in all cases, as required by the Java
semantics for wait and notify. In the Rover code, this reflects a global switch in the
original C++ program that would allow testing the program without locking, eliminating
possible deadlocks at the cost of introducing data races. Java does not allow this, so the
Java version of the program always needs to be run with locking enabled.

³ The portion of the call graph to be investigated for this was up to eight methods deep.

All data race warnings about shared field accesses were false positives. Reasons for 
false positives include the use of thread-local copies [18] or a thread-safe container class. 
In one case, only one thread instance that could access the shared field is ever generated. 
Evaluating data races for method calls was even more difficult and time-consuming. The 
errors found referred to cases where a read-only pattern was broken by certain methods,
creating potential data races. Because of their high number, the distribution of method 
data race warnings is noteworthy. A few classes which embody parallelized algorithms 
incurred the largest number of warnings, which were also the hardest to review. Classes 
encapsulating data are usually much simpler. Because some of these were heavily used 
in the program, a few of them were also responsible for a large number of warnings. 
However, these warnings were usually much easier to review. 

The 30 deadlock warnings all referred to the same two classes. There were two 
sets of warnings, the first set containing ten, the second one 20 warnings. The first ten 
warnings, all of them false positives, showed incomplete synchronization loops in the 
lock graph. The next 20 warnings, referring to seven methods, showed the same ten 
warnings with another edge in the lock graph, from the callee class back to the caller. 
Such a synchronization loop includes two sequences of different lock acquisitions in 
reverse order. This makes a cyclic deadlock possible. Therefore these warnings referred 
to actual faults in the code. 
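
One common way to remove such a synchronization loop is to acquire the locks in a fixed global order. The following sketch illustrates the idea with hypothetical names (Resource, transfer); it is not the fix applied to the Rover code and assumes that the two resources are distinct and carry unique ids.

```java
// Hypothetical example: two locks that are always taken in ascending id order.
class Resource {
    final int id;          // unique rank used to order lock acquisition
    int value;

    Resource(int id, int value) {
        this.id = id;
        this.value = value;
    }

    // Moves `amount` between two distinct resources. The locks are acquired in
    // ascending id order regardless of the direction of the transfer, so two
    // concurrent transfers never hold the locks in reverse order.
    static void transfer(Resource from, Resource to, int amount) {
        Resource first  = from.id < to.id ? from : to;
        Resource second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.value -= amount;
                to.value   += amount;
            }
        }
    }

    // Runs opposite transfers concurrently and returns the sum of both values.
    static int runOppositeTransfers(int rounds) {
        Resource a = new Resource(1, 1000);
        Resource b = new Resource(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return a.value + b.value;
    }
}
```

With a fixed acquisition order the lock graph contains only the edge from the lower to the higher id, so no cycle and hence no cyclic deadlock can arise between these two locks.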



Results: In only 15 minutes, four faults could be found by looking at the ten warnings 
referring to sequential properties. While reviewing the multi-threading warnings was 
time-consuming due to the complex interactions in the code, it helped to highlight the 
critical parts of the code. The effort was justifiable for a project of this complexity. 

3.3 Comparison to Other Projects 

In an internal case study at NASA Ames [7], several other tools were applied to the Rover 
code base, detecting 38 errors. Among these errors were 18 seeded faults. Interestingly, 
most of these errors found were not those detected by Jlint. Almost all the seeded bugs 
concerned algorithmic problems or hard-to-find deadlocks, which Jlint was not capable 
of finding. However, Jlint in turn detected a lot of faults which were not found by any 
other tool. Table 2 compares Jlint to the other case studies. In that table, missed faults 
include both sequential and multi-threading properties. 

The eleven new bugs found by Jlint were a great success, even considering that the 
seven deadlocks correspond to two classes where other deadlocks have been known to 
occur. However, Jlint reported different methods than those reported in other analyses. 

4 DS1

The second case study consisted of an attitude control system and a fault protection
system for the Deep Space 1 (DS1) space craft. It took 0.17 seconds to check the entire
code base on the same PowerPC G4 with a clock frequency of 500 MHz.
Table 2. Comparison of errors found by Jlint and by other tools.

Error type                              #   Evaluation
Seeded faults                           18  Not found by Jlint
Non-seeded faults, other than overflow  18  Not found by Jlint
Integer overflow                        2   Found by both case studies
null pointer                                New (i.e., only found by Jlint)
equals overridden but not hashCode          Translation artifact (not occurring in the C version)
Incorrect wait/notify usage                 Debugging artifact (not executable in Java)
Data races                              5   3 new, 2 dead code (unused methods)
Deadlocks                                   new (two classes known to be faulty involved)



4.1 Description of DS1

DS1 was a technology-testing mission, which was launched on October 24, 1998, and which
ended its primary mission in September 1999. DS1 contained and tested twelve new kinds
of space-travel technologies, for example, ion propulsion and artificial intelligence for
autonomous control. DS1 also contained more standard technologies, such as an attitude-control
system and a fault-protection system, coded in C. The attitude-control system
monitors and controls the space craft’s attitude, that is, its orientation in 3-dimensional
space. The attitude is controlled by small thrusters, which can be pointed, and fired, 
in different directions. The fault-protection system monitors the operation of the space 
craft and initiates corrective actions in case errors occur. The code examined in this case 
study is an 8,700-line Java version of the attitude-control system and fault-protection 
system, created in order to examine the potential for programming flight software in Java, 
as described in [5]. That effort consisted in particular of experimenting with the real- 
time specification for Java [6]. The original C code was re-designed in Java, using best 
practices in object-oriented design. The Java version used design patterns extensively, 
and put an emphasis on pluggable technology, relying on interfaces. 



4.2 Jlint Evaluation 

Sequential errors: Again, a first evaluation of Jlint’s warnings included only the sequential
cases. Table 3 shows an overview. Eleven warnings referred to name clashes
in variable names, a large risk of future programming errors. False positives resulted
from dead code, a code idiom that was a poor choice but acceptable in that case,
or compiler artifacts introduced by inner classes. Three warnings reported problems
with overridden methods, where several versions of a method with the same name but
different parameter lists (“signatures”) were only partially overridden. This must be
avoided because inconsistencies among the overridden and inherited variants are almost
inevitable.



Multi-threading errors: In the second phase, the 47 multi-threading warnings were 
investigated. Most of them were false positives: Warnings about run methods which are 





Applying Jlint to Space Exploration Software






Table 3. Jlint’s warnings for the DS1 code.

Type                                  Warnings  Problems   Correct     False      Time
                                                   found  warnings positives    [min.]
Local variable shadows field                 4         2         2         2         2
Component shadows base class                 7         0         0         7         3
Incomplete method overriding                 3         3         3         0         3
equals overridden but not hashCode           1         0         0         1         1
Total: Sequential errors                    15         5         5        10         9
Incorrect wait/notify usage                  7         0         0         7         3
run method not synchronized                  5         0         0         5         0
Overriding synchronized methods              3         0         0         3         2
Data race, field access                      1         0         0         1         7
Data race, method call                      20         1         6        14        38
Deadlock                                    11         0         0        11        20
Total: Multi-threading errors               47         1         6        41        70
Total                                       62         6        11        51        79



not synchronized are overly conservative. Warnings about wait/notify were caused
by the unsoundness of Jlint’s data flow analysis. False positives for data race warnings
were mostly caused by the fact that Jlint does not analyze all callers when checking
methods for thread safety. If all callers synchronize on the same lock, a seemingly
unsafe method becomes safe. Other reasons for false positives were the use of thread-safe
container classes in such methods, the use of read-only fields, and per-thread confinement
[18], which always creates a new instance as return value.

The six warnings indicating an error concerned calls to a logger method. In the logger
method, there were indeed data races, even though they may not be considered to be
crucial: The output of different formatting elements of different entries to be logged may
be interleaved.

Again, as in all non-trivial examples, deadlock warnings are almost impossible to
investigate in detail without a call graph browsing tool. Nevertheless, an effort was made.
After 12 minutes, it was found that the first deadlock warning was a false alarm due to the
lack of context sensitivity in Jlint’s call graph analysis. After this, most warnings could
be dismissed as duplicates of the first one. In the two remaining cases, Jlint’s warnings
did not give the full loop context, so they could not be used.



Results: Most sequential warnings could be evaluated very quickly. The problems 
found were code convention violations, which would not necessarily cause run-time 
errors. However, they are easy to fix and should be addressed. Reviewing the data race
warnings was relatively simple, although it would have been much easier with a call 
graph visualization tool. Most false positives could have been prevented by a more 
complete call graph analysis or recognizing a few simple design patterns. 
Table 4. Design patterns for avoiding data races in seemingly unsafe methods.

Code base                           Rover                               DS1                     Total
Problem category                    wait/       Data race   Data race   Data race   Data race
                                    notify      (field)     (method)    (field)     (method)
Read-only fields                    -           9           19          -           3           31
Synchronization for all callers     12          -           7           1           2           22
Return copy of data                 -           -           3           -           1           4
Thread-local copy during operation  -           1           1           -           -           2
Thread-safe container               -           -           1           -           2           3
One thread instance                 -           1           -           -           -           1
Total                               12          11          31          1           8           63



5 Design Patterns in Multi-threaded Software 

Sections 3 and 4 have shown that sequential properties are easy to evaluate with the aid 
of a static analysis tool. This is not the case with multi-threading problems. There are two 
ways to improve the situation: Make the evaluation of warnings easier using visualization 
tools, or improve the quality of the analysis itself, reducing the false positives. We focused 
on the latter aspect. When analyzing the warnings, it soon became apparent that only 
a few common code idioms were behind the problems. The remainder of this paper 
investigates what patterns are used to avoid multi-threading problems. 

Table 4 shows an overview of the different design patterns used in the code of the 
two space exploration projects to avoid conflicts with unprotected fields or methods. 
The counts correspond to the applications of these patterns, all of which result in one 
or more spurious warnings when analyzed with Jlint. When using these patterns, there
appears to be a data race, if a method is considered in isolation or without considering 
thread ownership. There is no data race when considering the entire program. 

The most common idiom used to prevent data races was the use of read-only values. 
Read-only fields are usually declared final and not changed after initialization. Because
this declaration discipline is not always followed strictly, recognizing it statically is not 
always trivial, but nevertheless feasible by checking all uses of a given field in the entire 
code. Ensuring global thread-safety in such cases is of course only possible in the absence 
of dynamic class loading. Other design patterns include: 

- Ensuring mutual exclusion in an unsafe method by having all callers of that method 
acquire a common lock. Such callers work as a thread-safe wrapper around unsyn- 
chronized parts of the code. 

- The usage of (deep) copies of data returned by a method ensures that the “working 
copy” used subsequently by the caller remains thread-local [18]. This eliminates the 
need for synchronization in the caller. 

- Copying method parameters restricts data ownership to the called method and the 
current thread [18]. The callee then does not have to be synchronized, but it is not 
allowed to use any shared data other than the copied parameters supplied by the 
caller. Doing otherwise would again require synchronization. 
- Legacy container data structures such as Vector are inherently thread-safe because
they internally use synchronization [21]. 

- Finally, if there exists only one thread instance of a particular class, no data races 
can occur if that thread is the only type that calls a certain method. 
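
Three of the idioms listed above (read-only fields, synchronization in all callers, and returning a copy of data) can be combined in one small sketch. The class Telemetry and its members are hypothetical and only illustrate the patterns; they do not reproduce code from either case study.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class combining three of the idioms listed above.
class Telemetry {
    private final String source;                              // read-only after construction
    private final List<Integer> samples = new ArrayList<>();  // guarded by `this`

    Telemetry(String source) {
        this.source = source;
    }

    // Read-only field: safe to read without a lock because it never changes.
    String source() {
        return source;
    }

    // Unsynchronized helper: thread-safe only if every caller holds the lock
    // on `this`. Considered in isolation, it looks like a data race.
    private void addUnsynchronized(int sample) {
        samples.add(sample);
    }

    // Synchronized wrapper: all callers reach the helper through this method.
    synchronized void record(int sample) {
        addUnsynchronized(sample);
    }

    // Returns a copy, so the caller's working copy is not shared with other threads.
    synchronized List<Integer> snapshot() {
        return new ArrayList<>(samples);
    }
}
```

A checker that inspects addUnsynchronized without looking at its callers would report a possible data race; recognizing that its only caller is synchronized removes the spurious warning.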

Two cases of false positives were not included in this summary: unused methods
(dead code) and conditional locking based on a global flag used for debugging
wait/notify locking (which was permissible in the original C++ Rover code but not
in the Java version).

This study indicates that four design patterns prevail in cases where code is apparently 
not thread-safe: Synchronization of all callers, use of read-only values, thread-local 
copies of data, and the use of thread-safe container classes. Although simple patterns 
prevail, their usage is not always trivial: Some of the data race warnings for the Rover 
code pointed out cases where it was attempted to use the read-only pattern, but the use 
was not carried out consistently throughout the project. Such a small mistake violates 
the property that guarantees thread-safety. This discussion so far concerned only data 
race warnings. No prevailing pattern has been found in the case of deadlocks, where the 
programmer has to ensure no cyclic lock dependency arises between threads. 



6 Conclusions 

Space exploration software is complex. The high costs incurred by potential software 
failures make the application of fault-finding tools very fruitful. Jlint was very successful 
as such a tool in both case studies, complementing the strengths of other tools. In each 
project, the study found four or five significant problems within only 15 minutes of 
evaluating Jlint’s warnings. The multi-threading warnings were more difficult and 
time-consuming to evaluate but still effective at pointing out critical parts in the code. 

An analysis of the false positives showed that in apparently thread-unsafe code, four 
common design patterns ensure thread-safety in all cases. Static analysis tools should 
therefore be extended with specific algorithms geared towards these patterns to reduce 
false positives. Furthermore, these patterns were not always applied correctly and are 
still a significant source of programming errors. This calls for tools that verify the correct 
application of these patterns, thereby pointing out even more subtle errors than previously 
possible. 



References 

1. C. Artho. Finding Faults in Multi-Threaded Programs. Master’s thesis, ETH Zürich, 2001. 

2. C. Artho and A. Biere. Applying Static Analysis to Large-Scale, Multi-threaded Java 
Programs. In D. Grant, editor, Proceedings of the 13th ASWEC, pages 68-75. IEEE CS Press, 
2001. 

3. C. Artho, K. Havelund, and A. Biere. High-Level Data Races. In VVEIS’03, April 2003. 

4. T. Ball, A. Podelski, and S. Rajamani. Boolean and Cartesian Abstractions for Model Checking 
C Programs. In Proc. TACAS’01: Tools and Algorithms for the Construction and Analysis of 
Systems, LNCS, Italy, 2001. 







5. E. G. Benowitz and A. F. Niessner. Java for Flight Software. In Space Mission Challenges 
for Information Technology, July 2003. 

6. G. Bollella, J. Gosling, B. Brosgol, P. Dibble, S. Furr, and M. Turnbull. The Real-Time 
Specification for Java. Addison-Wesley, 2000. 

7. G. Brat, D. Giannakopoulou, A. Goldberg, K. Havelund, M. Lowry, C. Pasareanu, A. Venet, 
and W. Visser. Experimental Evaluation of Verification and Validation Tools on Martian Rover 
Software. In SEI Software Model Checking Workshop, 2003. Extended abstract. 

8. J. Corbett, M. B. Dwyer, J. Hatcliff, C. S. Pasareanu, Robby, S. Laubach, and H. Zheng. 
Bandera: Extracting Finite-state Models from Java Source Code. In Proc. 22nd International 
Conference on Software Engineering, Ireland, 2000. ACM Press. 

9. D. L. Detlefs, K. R. M. Leino, G. Nelson, and J. B. Saxe. Extended Static Checking. 
Technical Report 159, Compaq Systems Research Center, Palo Alto, California, USA, 1998. 

10. P. Godefroid. Model Checking for Programming Languages using VeriSoft. In Proc. 24th 
ACM Symposium on Principles of Programming Languages, pages 174-186, France, 1997. 

11. J. Gosling, B. Joy, G. Steele, and G. Bracha. The Java Language Specification, Second 
Edition. Addison Wesley, 2000. 

12. J. Harrow. Runtime Checking of Multithreaded Applications with Visual Threads. In 7th 
SPIN Workshop, volume 1885 of LNCS, pages 331-342. Springer, 2000. 

13. K. Havelund. Using Runtime Analysis to Guide Model Checking of Java Programs. In 7th 
SPIN Workshop, volume 1885 of LNCS, pages 245-264. Springer, 2000. 

14. K. Havelund, M. Lowry, S. Park, C. Pecheur, J. Penix, W. Visser, and J. White. Formal 
Analysis of the Remote Agent Before and After Flight. In 5th NASA Langley Formal Methods 
Workshop, June 2000. USA. 

15. K. Havelund and T. Pressburger. Model Checking Java Programs using Java PathFinder. 
International Journal on Software Tools for Technology Transfer, 2(4):366-381, 2000. 

16. K. Havelund and G. Roşu. Monitoring Java Programs with Java PathExplorer. In K. Havelund 
and G. Roşu, editors, Runtime Verification (RV’01), volume 55 of ENTCS. Elsevier Science, 
2001. 

17. G. Holzmann and M. Smith. A Practical Method for Verifying Event-Driven Software. In 
Proc. ICSE’99, International Conference on Software Engineering, USA, 1999. IEEE/ACM. 

18. D. Lea. Concurrent Programming in Java, Second Edition. Addison Wesley, 1999. 

19. T. Lindholm and A. Yellin. The Java Virtual Machine Specification, Second Edition. Addison 
Wesley, 1999. 

20. Sun Microsystems. Connected, limited device configuration, specification version 1.0a, May 
2000. http://java.sun.com/j2me/docs/. 

21. Sun Microsystems. Java 2 documentation, http://java.sun.com/j2se/1.4/docs/. 

22. S. Savage, M. Burrows, G. Nelson, P. Sobalvarro, and T. Anderson. Eraser: A Dynamic 
Data Race Detector for Multithreaded Programs. ACM Transactions on Computer Systems, 
15(4):391-411, 1997. 

23. S. D. Stoller. Model-Checking Multi-threaded Distributed Java Programs. In 7th SPIN 
Workshop, volume 1885 of LNCS, pages 224-244. Springer, 2000. 

24. W. Visser, K. Havelund, G. Brat, and S. Park. Model Checking Programs. In Proc. ASE’2000: 
The 15th IEEE International Conference on Automated Software Engineering. IEEE CS Press, 
2000. 




Why AI + ILP Is Good for WCET, but MC Is 
Not, Nor ILP Alone 



Reinhard Wilhelm 



Informatik 

Universität des Saarlandes 
D - 66123 Saarbrücken 



Abstract. A combination of Abstract Interpretation (AI) with Integer 
Linear Programming (ILP) has been successfully used to determine precise 
upper bounds on the execution times of real-time programs, commonly 
called worst-case execution times (WCET). The task solved by 
abstract interpretation is to verify as many local safety properties as 
possible, namely safety properties which correspond to the absence of 
“timing accidents”. Timing accidents, e.g. cache misses, are reasons for an 
increase of the execution time of an individual instruction in an execution state. 
This article attempts to answer the frequently encountered 
claim, “one could have done it by Model Checking (MC)!”. It shows that 
it is a characteristic property of abstract interpretation that makes 
AI applicable and successful, namely that it needs only one fixpoint 
iteration to compute invariants that allow the derivation of many safety 
properties. MC seems to encounter an exponential state-space explosion 
when faced with the same problem. ILP alone has also been used to 
model a processor architecture and a program whose upper bounds for 
execution times were to be determined. It is argued why the only ILP-only 
approach found in the literature has not led to success. 



1 Introduction 

Hard real-time systems are subject to stringent timing constraints which are 
dictated by the surrounding physical environment. A schedulability analysis has 
to be performed in order to guarantee that all timing constraints will be met 
(“timing validation”). Existing techniques for schedulability analysis require 
upper bounds for the execution times of all the system’s tasks to be known. These 
upper bounds are commonly called the worst-case execution times (WCETs), and 
we will stick to this doubtful naming convention. In analogy, lower bounds on 
the execution time have been named best-case execution times (BCETs). WCETs 
(and BCETs) have to be safe, i.e., they must never underestimate (overestimate) 
the real execution time. Furthermore, they should be tight, i.e., the 
overestimation should be as small as possible. 

Note that customers of WCET tools want to see how far the WCET is 
away from the deadline. They are not content with a YES/NO answer, “your 
program will always terminate (cannot be guaranteed to terminate) within the 
given deadline”. Often, customers even want to see BCETs to see how big the 
interval is. This gives them a feeling for the quality of the approximation. Also, 
any schedulability analysis uses WCETs to exploit the difference between the 
deadlines and the WCETs as leeway in the scheduling process. 

For processors with fixed execution times for each instruction, methods to 
compute sharp WCETs [18,17] have long been established. However, in modern 
microprocessor architectures, caches, pipelines, and all kinds of speculation 
are key features for improving performance. Caches are used to bridge the gap 
between processor speed and the access time of main memory. Pipelines enable 
acceleration by overlapping the executions of different instructions. The 
consequence is that the execution time of individual instructions, and thus the 
contribution of one execution of an instruction to the program’s execution time, 
can vary widely. The interval of execution times for one instruction is bounded 
by the execution times of the following two cases: 

— The instruction goes “smoothly” through the pipeline; all loads hit the cache, 
no pipeline hazard happens, i.e., all operands are ready, no resource conflicts 
with other currently executing instructions exist. 

— “Everything goes wrong”, i.e., instruction and operand fetches miss the 
cache, resources needed by the instruction are occupied, etc. 




Fig. 1. The architecture of the aiT WCET tool. 

We will call any increase in execution time during an individual instruction’s 
execution a timing accident and the amount of increase the timing penalty of 
this accident. Our experience shows that added timing penalties can be in the 
order of several hundred processor cycles. The execution time of an instruction 
depends on the execution state, e.g., the contents of the cache(s), the occupancy 
of other resources, and thus on the execution history. It is therefore obvious that 
the execution time cannot be determined in isolation from the execution history. 

A more or less standard generic architecture for WCET tools has emerged [8, 
4,22]. Fig. 1 shows one instance of this architecture. A first pass, depicted on 
the left, predicts the behaviour of processor components for the instructions of 
the program. This allows one to derive safe upper bounds for the execution times of 







basic blocks. A second pass, the column on the right, computes an upper bound 
on the execution times over all possible paths of the program. 

A combination of Abstract Interpretation (for the first pass) and Integer Linear 
Programming (for the second pass) has been successfully used to determine 
precise upper bounds on the execution times of real-time programs running on 
processors used in embedded systems [24,9,5,6,1]. A commercially available tool, 
aiT by AbsInt, cf. http://www.absint.de/wcet.htm, was implemented and is 
used in the aeronautics and automotive industries. 

The contribution of an individual instruction to the total execution time 
of a program may vary widely depending on the execution history. There are 
in general too many possible execution histories to consider them exhaustively. 
Different execution histories may be summarized into contexts if their influence on 
the behaviour of instructions or sequences of instructions does not differ 
significantly. Experience has shown [24] that the right differentiation of control-flow 
contexts is decisive for the precision of the prediction [15]. Contexts are 
associated with basic blocks, i.e., maximally long straight-line code sequences that 
can be entered only at the first instruction and left only at the last. They indicate 
through which sequence of function calls and loop iterations control arrived at 
the basic block. 

2 Problem Statement 

First the problem statement: Given are 

— an (abstract) processor model. This model needs to be conservative with 
respect to the timing properties of the processor, i.e., any duration of a 
program execution derived from the abstract model must not be shorter than 
the duration of the program execution on the real processor. For this reason, 
we call this abstract processor model the timing model of the processor. 
Different degrees of abstraction are possible. The more abstract the timing 
model is, the more efficient the analysis can be. 

— a fully linked executable program. This is the program whose WCET should 
be determined. It is annotated with all necessary information to make WCET 
determination possible, e.g. loop iteration counts, maximal recursion depths 
etc. 

— possibly also a deadline, within which the program should terminate. For- 
mally, it makes a difference for model-checking approaches, whether we have 
the deadline or not. In practice, it does not make a difference. 

A solution consists of an upper bound for the execution times of all executions 
of the program. 



3 WCET Determination by an AI + ILP Approach 

3.1 Modularization 

Methods based on AI and ILP are realized in the standard architecture 
mentioned in the Introduction. They have (at least) two phases: 

1. Predict safely the processor behaviour for the given program. This is done by 
computing invariants about the processor state at each program point, such 
as the set of memory blocks that surely are in the cache every time execution 
reaches this program point, as well as the contents of prefetch queues and 
free resources for the corresponding instructions. This information allows 
the derivation of safety properties, namely the absence of timing accidents. 
Examples of such safety properties are: the instruction fetch will not cause a 
cache miss; the load pipeline stage will not encounter a data hazard. 

Any such safety property allows one to reduce the estimated execution time of 
an instruction by the corresponding timing penalty. Actually, for efficiency 
reasons upper bounds are calculated for basic blocks. 

2. Determine one path through the program, on which the upper bound for 
the execution times is computed. This is done by mapping the control flow 
of the program to an integer linear program maximizing execution time for 
paths through the program and solving this ILP [21,23]. 

3.2 Cache-Behaviour Prediction 

How Abstract Interpretation is used to compute invariants about cache con- 
tents is described next. How the behavior of programs on processor pipelines is 
predicted can be found elsewhere [10,20,25]. 

Cache Memories. A cache can be characterized by three major parameters: 

— capacity is the number of bytes it may contain. 

— line size (also called block size) is the number of contiguous bytes that are 
transferred from memory on a cache miss. The cache can hold at most n = 
capacity / line size blocks. 

— associativity is the number of cache locations where a particular block may 
reside. 

n / associativity is the number of sets of a cache. 

If a block can reside in any cache location, then the cache is called fully associa- 
tive. If a block can reside in exactly one location, then it is called direct mapped. 
If a block can reside in exactly A locations, then the cache is called A-way set 
associative. The fully associative and the direct mapped caches are special cases 
of the A-way set associative cache where A = n and A = 1, resp. 

In the case of an associative cache, a cache line has to be selected for re- 
placement when the cache is full and the processor requests further data. This 
is done according to a replacement strategy. Common strategies are LRU (Least 
Recently Used), FIFO (First In First Out), and random. 

The set where a memory block may reside in the cache is uniquely determined 
by the address of the memory block, i.e., the behavior of the sets is independent 
of each other. The behavior of an A-way set associative cache is completely 
described by the behavior of its n/A fully associative sets. This holds also for 
direct mapped caches where A = 1. 
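
As a small illustration of these parameters, the following Python sketch derives n and the number of sets from capacity, line size and associativity. The class name, the example sizes, and in particular the modulo placement used in `set_index` are assumptions made for illustration; the text only states that the set is uniquely determined by the address of the memory block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheGeometry:
    capacity: int       # number of bytes the cache may contain
    line_size: int      # contiguous bytes transferred on a cache miss
    associativity: int  # number of locations where a block may reside (A)

    @property
    def num_blocks(self) -> int:
        # n = capacity / line size
        return self.capacity // self.line_size

    @property
    def num_sets(self) -> int:
        # n / associativity
        return self.num_blocks // self.associativity

    def set_index(self, address: int) -> int:
        # ASSUMPTION: modulo placement on the block number. The source only
        # says that the set is uniquely determined by the block address.
        return (address // self.line_size) % self.num_sets


# Hypothetical 512-byte cache with 16-byte lines (n = 32).
direct_mapped = CacheGeometry(capacity=512, line_size=16, associativity=1)
four_way = CacheGeometry(capacity=512, line_size=16, associativity=4)
fully_associative = CacheGeometry(capacity=512, line_size=16, associativity=32)
```

With these assumed sizes, the direct mapped cache has 32 sets of one line each, the 4-way cache has 8 sets, and the fully associative cache (A = n) has a single set.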

For the sake of space, we restrict our description to the semantics of fully 
associative caches with LRU replacement strategy. More complete descriptions 
that explicitly describe direct mapped and A-way set associative caches can be 
found in [7,6]. 

Cache Semantics. In the following, we consider a (fully associative) cache as 
a set of cache lines $L = \{l_1, \ldots, l_n\}$ and the store as a set of memory blocks 
$S = \{s_1, \ldots, s_m\}$. 

To indicate the absence of any memory block in a cache line, we introduce a new 
element $I$; $S' = S \cup \{I\}$. 

Definition 1 (concrete cache state). 

A (concrete) cache state is a function $c : L \to S'$. 

$C_c$ denotes the set of all concrete cache states. The initial cache state $c_I$ maps 
all cache lines to $I$. 

If $c(l_i) = s_y$ for a concrete cache state $c$, then $i$ is the relative age of the 
memory block according to the LRU replacement strategy and not necessarily 
the physical position in the cache hardware. 

The update function describes the effect on the cache of referencing a block 
in memory. The referenced memory block $s_x$ moves into $l_1$ if it was in the cache 
already. All memory blocks in the cache that had been more recently used than 
$s_x$ increase their relative age by one, i.e., they are shifted by one position to the 
next cache line. If the referenced memory block was not yet in the cache, it is 
loaded into $l_1$ after all memory blocks in the cache have been shifted and the 
‘oldest’, i.e., least recently used memory block, has been removed from the cache 
if the cache was full. 

Definition 2 (cache update). A cache update function $\mathcal{U} : C_c \times S \to C_c$ 
determines the new cache state for a given cache state and a referenced memory 
block. 

Updates of fully associative caches with LRU replacement strategy are pictured 
in Figure 2. 

[Figure: cache lines ordered by age from "young" to "old"; the referenced block is [s].] 

Fig. 2. Update of a concrete fully associative (sub-) cache. 
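
The concrete LRU update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a cache state is modelled as a tuple whose position 0 plays the role of the youngest line, `None` stands for the empty element, and the names `update` and `update_seq` are hypothetical helpers, not taken from the source.

```python
from typing import Iterable, Optional, Tuple

# Position 0 is the youngest cache line; None marks an empty line.
CacheState = Tuple[Optional[str], ...]


def update(c: CacheState, s: str) -> CacheState:
    """LRU update of a fully associative cache after a reference to block s."""
    if s in c:
        h = c.index(s)
        # Hit: blocks younger than s age by one, older blocks keep their place.
        return (s,) + c[:h] + c[h + 1:]
    # Miss: all blocks age by one; the content of the oldest line is dropped.
    return (s,) + c[:-1]


def update_seq(c: CacheState, refs: Iterable[str]) -> CacheState:
    """Apply update to each reference in order."""
    for s in refs:
        c = update(c, s)
    return c


empty = (None,) * 4                 # initial state: every line is empty
full = update_seq(empty, "abcd")    # ('d', 'c', 'b', 'a')
after_hit = update(full, "b")       # ('b', 'd', 'c', 'a')
after_miss = update(after_hit, "e") # ('e', 'b', 'd', 'c'), block a is evicted
```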



Control Flow Representation. We represent programs by control flow graphs 
consisting of nodes and typed edges. The nodes represent basic blocks. A basic 
block is a sequence (of fragments) of instructions in which control flow enters 
at the beginning and leaves at the end without halt or possibility of branching 
except at the end. For cache analysis, it is most convenient to have one 
memory reference per control flow node. Therefore, our nodes may represent 
the different fragments of machine instructions that access memory. For 
imprecisely determined addresses of data references, one can use a set of possibly 
referenced memory blocks. We assume that for each basic block, the sequence of 
references to memory is known, i.e., there exists a mapping from control flow 
nodes to sequences of memory blocks: $\mathcal{L} : V \to S^*$. (This assumption is 
appropriate for instruction caches and can be too restrictive for data caches and 
combined caches. See [7,1] for weaker restrictions.) 

We can describe the effect of such a sequence on a cache with the help of the 
update function $\mathcal{U}$. Therefore, we extend $\mathcal{U}$ to sequences of memory references 
by sequential composition: $\mathcal{U}(c, \langle s_{x_1}, \ldots, s_{x_y} \rangle) = \mathcal{U}(\ldots(\mathcal{U}(c, s_{x_1}))\ldots, s_{x_y})$. 

The cache state for a path $(k_1, \ldots, k_p)$ in the control flow graph is given by 
applying $\mathcal{U}$ to the initial cache state $c_I$ and the concatenation of all sequences 
of memory references along the path: $\mathcal{U}(c_I, \mathcal{L}(k_1) \cdot \ldots \cdot \mathcal{L}(k_p))$. 

The Collecting Semantics of a program gathers at each program point the set 
of all execution states, which the program may encounter at this point during 
some execution. A semantics on which to base a cache analysis has to model cache 
contents as part of the execution state. One could thus compute the collecting 
semantics and project the execution states onto their cache components to obtain 
the set of all possible cache contents for a given program point. However, the 
collecting semantics is in general not computable. 

Instead, one restricts the standard semantics to only those program constructs 
which involve the cache, i.e., memory references. Only they have an effect 
on the cache, modelled by the cache update function $\mathcal{U}$. This coarser semantics 
may execute program paths which are not executable in the standard semantics. 
Therefore, the Collecting Cache Semantics of a program computes a superset of 
the set of all concrete cache states occurring at each program point. 

Definition 3 (Collecting Cache Semantics). 

The Collecting Cache Semantics of a program is 

$C_{coll}(p) = \{\mathcal{U}(c_I, \mathcal{L}(k_1) \cdot \ldots \cdot \mathcal{L}(k_n)) \mid (k_1, \ldots, k_n) \text{ path in the CFG leading to } p\}$ 

This collecting semantics would be computable, although often of enormous 
size. Therefore, another step abstracts it into a compact representation, so-called 
abstract cache states. Note that all information drawn from the abstract cache 
states allows one to safely deduce information about sets of concrete cache states, 
i.e., only precision may be reduced in this two-step process. Correctness is guaranteed. 

Abstract Semantics. The specification of a program analysis consists of the spec- 
ification of an abstract domain and of the abstract semantic functions, mostly 
called transfer functions. The least upper bound operator of the domain com- 
bines information when control flow merges. 

We present two analyses. The must analysis determines a set of memory 
blocks that are in the cache at a given program point whenever execution reaches 
this point. The may analysis determines all memory blocks that may be in the 
cache at a given program point. The latter analysis is used to determine the 
absence of a memory block in the cache. 

Table 1. Categorizations of memory references and memory blocks. 

Category        Abb.  Meaning 
always hit      ah    The memory reference will always result in a cache hit. 
always miss     am    The memory reference will always result in a cache miss. 
not classified  nc    The memory reference could neither be classified as ah nor am. 



The analyses are used to compute a categorization for each memory reference 
describing its cache behavior. The categories are described in Table 1. 
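
How the results of the two analyses combine into these categories can be sketched as follows. This is a minimal illustration, assuming the analysis results at a program point are available simply as two Python sets of block names; the function name `classify` and the example blocks are hypothetical.

```python
from typing import Set


def classify(block: str, must_blocks: Set[str], may_blocks: Set[str]) -> str:
    """Categorize a reference to block at one program point.

    must_blocks: blocks guaranteed to be cached whenever this point is reached.
    may_blocks:  blocks that may be cached in some execution reaching this point.
    """
    if block in must_blocks:
        return "ah"  # always hit
    if block not in may_blocks:
        return "am"  # always miss
    return "nc"      # not classified


# Invented analysis results for one program point.
must_here = {"a", "b"}
may_here = {"a", "b", "c"}
```

Here a reference to `a` is an always hit, a reference to `d` is an always miss, and a reference to `c` stays unclassified.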

The domains for our abstract interpretations consist of abstract cache states: 

Definition 4 (abstract cache state). An abstract cache state $\hat{c} : L \to 2^S$ 
maps cache lines to sets of memory blocks. $\hat{C}$ denotes the set of all abstract 
cache states. 

The position of a line in an abstract cache will, as in the case of concrete 
caches, denote the relative age of the corresponding memory blocks. Note, how- 
ever, that the domains of abstract cache states will have different partial orders 
and that the interpretation of abstract cache states will be different in the dif- 
ferent analyses. 

The following functions relate concrete and abstract domains. An extrac- 
tion function, extr, maps a concrete cache state to an abstract cache state. The 
abstraction function, abstr, maps sets of concrete cache states to their best repre- 
sentation in the domain of abstract cache states. It is induced by the extraction 
function. The concretization function, concr, maps an abstract cache state to the 
set of all concrete cache states represented by it. It allows one to interpret abstract 
cache states. It is often induced by the abstraction function, cf. [16]. 

Definition 5 (extraction, abstraction, concretization functions). The 
extraction function $extr : C_c \to \hat{C}$ forms singleton sets from the images of the 
concrete cache states it is applied to, i.e., $extr(c)(l_i) = \{s_x\}$ if $c(l_i) = s_x$. 
The abstraction function $abstr : 2^{C_c} \to \hat{C}$ is defined by 

$abstr(C) = \bigsqcup \{extr(c) \mid c \in C\}$ 

The concretization function $concr : \hat{C} \to 2^{C_c}$ is defined by 

$concr(\hat{c}) = \{c \mid extr(c) \sqsubseteq \hat{c}\}$ 

So much for the commonalities of all the domains to be designed. Note that all the 
constructions are parameterized in $\sqcup$ and $\hat{C}$. 

The transfer functions, the abstract cache update functions, all denoted $\hat{\mathcal{U}}$, 
will describe the effects of a control flow node on an element of the abstract 
domain. They will be composed of two parts, 

1. “refreshing” the accessed memory block, i.e., inserting it into the youngest 
cache line, 

2. “aging” some or all other memory blocks already in the abstract cache. 

Termination of the analyses. There are only a finite number of cache lines and for 
each program a finite number of memory blocks. This means that the domain 
of abstract cache states $\hat{c} : L \to 2^S$ is finite. Hence, every ascending chain is 
finite. Additionally, the abstract cache update functions, $\hat{\mathcal{U}}$, are monotonic. This 
guarantees that all the analyses will terminate. 

Must Analysis. As explained above, the must analysis determines a set of memory 
blocks that are in the cache at a given program point whenever execution 
reaches this point. Good information is the knowledge that a memory block is 
in this set. The bigger the set, the better. As we will see, additional information 
will even tell how long it will at least stay in the cache. This is connected to the 
“age” of a memory block. Therefore, the partial order on the must-domain is as 
follows. Take an abstract cache state $\hat{c}$. Above $\hat{c}$ in the domain, i.e., less precise, 
are states where memory blocks from $\hat{c}$ are either missing or are older than in $\hat{c}$. 
Therefore, the $\sqcup$-operator applied to two abstract cache states $\hat{c}_1$ and $\hat{c}_2$ will 
produce a state $\hat{c}$ containing only those memory blocks contained in both and will 
give them the maximum of their ages in $\hat{c}_1$ and $\hat{c}_2$ (see Figure 4). The positions 
of the memory blocks in the abstract cache state are thus the upper bounds of 
the ages of the memory blocks in the concrete caches occurring in the collecting 
cache semantics. Concretization of an abstract cache state, $\hat{c}$, produces the set 
of all concrete cache states which contain all the memory blocks contained in $\hat{c}$ 
with ages not older than in $\hat{c}$. Cache lines not filled by these are filled with other 
memory blocks. 

We use the abstract cache update function depicted in Figure 3. 




Fig. 3. Update of an abstract fully associative (sub-) cache. 

Fig. 4. Combination for must analysis (“intersection + maximal age”). 

The solution of the must analysis problem is interpreted as follows: Let $\hat{c}$ 
be an abstract cache state at some program point. If $s_x \in \hat{c}(l_i)$ for a cache 
line $l_i$, then $s_x$ will definitely be in the cache whenever execution reaches this 
program point. A reference to $s_x$ is categorized as always hit (ah). There is even 
a stronger interpretation of the fact that $s_x \in \hat{c}(l_i)$: $s_x$ will stay in the cache at 
least for the next $n - i$ references to memory blocks that are not in the cache 
or are older than the memory blocks in $\hat{c}$, whereby $s_a$ is older than $s_b$ means: 
$\exists l_i, l_j : s_a \in \hat{c}(l_i), s_b \in \hat{c}(l_j), i > j$. 
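
The must-analysis combination ("intersection + maximal age") can be sketched as follows, together with one common formulation of the abstract update. The update rule is an assumption here, since the source only depicts it in Figure 3: blocks younger than the referenced block age by one, and on a possible miss all blocks age by one. An abstract state is modelled as a dict from block name to an upper bound on its age (1 is youngest), which is also an illustrative choice.

```python
from typing import Dict

MustState = Dict[str, int]  # block -> upper bound on its age (1 = youngest)


def must_update(c: MustState, s: str, n: int) -> MustState:
    """Abstract update for a reference to s in a cache with n lines.

    ASSUMPTION: standard must-update from the literature, not spelled out
    in the text above.
    """
    h = c.get(s, n + 1)  # n + 1 encodes "not known to be in the cache"
    new = {s: 1}         # "refreshing": s moves to the youngest line
    for block, age in c.items():
        if block == s:
            continue
        new_age = age + 1 if age < h else age  # "aging" only younger blocks
        if new_age <= n:                       # blocks aged past n are dropped
            new[block] = new_age
    return new


def must_join(c1: MustState, c2: MustState) -> MustState:
    """Keep blocks present in both states, each with its maximal age."""
    return {b: max(c1[b], c2[b]) for b in c1.keys() & c2.keys()}


# Invented example states for a 4-line cache.
left = {"c": 1, "e": 2, "a": 3, "d": 4}
right = {"a": 1, "c": 3, "f": 3, "d": 4}
joined = must_join(left, right)  # {'a': 3, 'c': 3, 'd': 4}
```

Blocks `e` and `f` disappear from the joined state because each is guaranteed on only one of the two incoming paths.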



May Analysis. To determine if a memory block $s_x$ will never be in the cache, 
we compute the complementary information, i.e., sets of memory blocks that 
may be in the cache. “Good” information is that a memory block is not in this 
set, because this memory block can be classified as definitely not in the cache 
whenever execution reaches the given program point. Thus, the smaller the sets 
are, the better. Additionally, the older blocks will reach the desired situation to 
be removed from the cache faster than the younger ones. Therefore, the partial 
order on this domain is as follows. Take some abstract cache state $\hat{c}$. Above $\hat{c}$ in 
the domain, i.e., less precise, are those states which contain additional memory 
blocks or where memory blocks from $\hat{c}$ are younger than in $\hat{c}$. Therefore, the 
$\sqcup$-operator applied to two abstract cache states $\hat{c}_1$ and $\hat{c}_2$ will produce a state 
$\hat{c}$ containing those memory blocks contained in $\hat{c}_1$ or $\hat{c}_2$ and will give them the 
minimum of their ages in $\hat{c}_1$ and $\hat{c}_2$ (see Figure 5). 

The positions of the memory blocks in the abstract cache state are thus the 
lower bounds of the ages of the memory blocks in the concrete caches occurring 
in the collecting cache semantics. 

Fig. 5. Combination for may analysis (“union + minimal age”). 
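
The may-analysis combination ("union + minimal age") admits an equally short sketch. As an illustrative assumption, an abstract state is modelled as a dict from block name to a lower bound on its age (1 is youngest); the function name and the example states are hypothetical.

```python
from typing import Dict

MayState = Dict[str, int]  # block -> lower bound on its age (1 = youngest)


def may_join(c1: MayState, c2: MayState) -> MayState:
    """Keep blocks present in either state, each with its minimal age."""
    joined = dict(c1)
    for block, age in c2.items():
        joined[block] = min(age, joined.get(block, age))
    return joined


# Invented example states for a 4-line cache.
left_may = {"c": 1, "e": 2, "a": 3, "d": 4}
right_may = {"a": 1, "c": 2, "f": 2, "d": 4}
joined_may = may_join(left_may, right_may)
# {'c': 1, 'e': 2, 'a': 1, 'd': 4, 'f': 2}
```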



The solution of the may analysis problem is interpreted as follows: The fact 
that $s_x$ is in the abstract cache $\hat{c}$ means that $s_x$ may be in the cache during some 
execution when the program point is reached. We can even say that memory 
block $s_x \in \hat{c}(l_i)$ will surely be removed from the cache after at most $n - i + 1$ 
references to memory blocks that are not in the cache or are older than or of the 
same age as $s_x$, if there are no memory references to $s_x$. $s_a$ is older than or of the 
same age as $s_b$ means: $\exists l_i, l_j : s_a \in \hat{c}(l_i), s_b \in \hat{c}(l_j), i \geq j$. If $s_x$ is not in $\hat{c}(l_i)$ for 
any $l_i$, then it will definitely not be in the cache on any execution. A reference 
to $s_x$ is categorized as always miss (am). 

3.3 Worst Case Path Analysis Using Integer Linear Programming (ILP) 

The structure of a program and the set of program paths can be mapped to 
an ILP in a very natural way. A set of constraints describes the control flow of 
the program. Solving these constraints yields very precise results [22]. However, 
the precision required of the analysis results demands that basic blocks be regarded 
in different contexts, i.e., in the different ways in which control reached them. This 
makes the control quite complex, so that the mapping to an ILP may be very 
difficult [23]. 

A problem formulated in an ILP consists of two parts: the cost function 
and constraints on the variables used in the cost function. Our cost function 
represents the number of CPU cycles. Correspondingly, it has to be maximised. 
Each variable in the cost function represents the execution count of one basic 
block of the program and is weighted by the execution time of that basic block. 
Additionally, variables are used corresponding to the traversal counts of the 
edges in the control flow graph. 

The integer constraints describing how often basic blocks are executed rela- 
tive to each other can be automatically generated from the control flow graph. 
However, additional information about the program provided by the user is usu- 
ally needed, as the problem of finding the worst case program path is unsolvable 
in the general case. Loop and recursion bounds cannot always be inferred auto- 
matically and must therefore be provided by the user. 

The ILP approach for program path analysis has the advantage that users 
are able to describe in precise terms virtually anything they know about the 
program. In our system, arbitrary integer constraints can be added by the user 
to improve the analysis. The system first generates the obvious constraints auto- 
matically and then adds user supplied constraints to tighten the WCET bounds. 
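
To make the shape of such an ILP concrete, here is a hedged worked example for a hypothetical program with four basic blocks: an entry block A, a loop header B, a loop body C, and an exit block D, with edges A→B, B→C, C→B and B→D. The block execution times (5, 2, 20 and 3 cycles) and the loop bound of 10 are invented for illustration; $x_i$ denotes the execution count of block $i$ and $e_{ij}$ the traversal count of edge $i \to j$.

```latex
\begin{align*}
\text{maximize}\quad & 5\,x_A + 2\,x_B + 20\,x_C + 3\,x_D \\
\text{subject to}\quad
 & x_A = 1, \qquad x_A = e_{AB}, \\
 & x_B = e_{AB} + e_{CB} = e_{BC} + e_{BD}, \\
 & x_C = e_{BC} = e_{CB}, \qquad x_D = e_{BD}, \\
 & e_{BC} \le 10\, e_{AB} \quad \text{(user-supplied loop bound)}, \\
 & x_i,\ e_{ij} \in \mathbb{N}.
\end{align*}
```

Under these assumed numbers the flow constraints force $e_{BD} = 1$, the loop bound yields $x_C = 10$ and $x_B = 11$, and the maximal objective value is $5 + 22 + 200 + 3 = 230$ cycles.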



4 WCET Determination by MC 

Each abstract interpretation has encoded in it the type of invariants it is able to 
derive. It analyzes given programs and derives invariants at each of their program 
points, called local invariants. These invariants then imply safety properties of 
the execution states at these program points, analogously called local safety 
properties. In our application, we are interested in the verification of the largest 
subset of a set of local safety properties, that could hold at a program point. 
The fundamental question is, what on the MC side corresponds to this ability 
of verifying an a priori unknown set of local safety properties. The following 
analogy comes to one’s mind. P. Cousot has shown in [3] that the well-known 
Constant Propagation analysis cannot be done by model checking. A constant- 
propagation analysis determines at each program point a set of variable-constant 
pairs (x, c) such that x is guaranteed to have value c each time execution reaches 
this program point. The model checker would need to know the constant value 
c of program variable x in order to derive that x has value c at program point n 
whenever execution reaches n. 
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Here are two alternatives, how WCETs could be determined by MC. They 
all use exponential search to compute upper bounds instead of YES/NO answers 
in the following way: 

A measured execution time is taken as the initial value for an upward search
that doubles the expected bound each time, until the MC verifies that the bound
suffices. A subsequent binary search then determines an upper bound. This
process needs a logarithmic number of runs of the model checker. An added value
of this procedure may be that it delivers two values enclosing the WCET at
convergence.
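
The search procedure can be sketched as follows. This is a minimal sketch
under assumptions: bound_suffices is a hypothetical stand-in for one run of
the model checker, times are integer cycle counts, and the answer is assumed
to be monotone (if a bound suffices, every larger bound does too).

```python
def search_wcet_bound(measured, bound_suffices):
    """Return (low, high, runs).

    high is a bound the checker has verified; low is the largest value that
    failed verification (or the measured time itself if it already suffices).
    """
    if measured < 1:
        raise ValueError("measured time must be positive")
    runs = 0
    low = high = measured  # a measured time never exceeds the WCET
    # Upward phase: double the candidate until the checker accepts it.
    while True:
        runs += 1
        if bound_suffices(high):
            break
        low = high
        high *= 2
    # Binary phase: shrink the interval between a failed and a verified bound.
    while high - low > 1:
        mid = (low + high) // 2
        runs += 1
        if bound_suffices(mid):
            high = mid
        else:
            low = mid
    return low, high, runs

# Toy oracle: pretend the true WCET is 1000 cycles.
low, high, runs = search_wcet_bound(130, lambda b: b >= 1000)
print(low, high)  # 999 1000
```

Each call to bound_suffices corresponds to one full run of the model checker,
so in practice one might stop the binary phase once the interval is narrow
enough rather than at width one.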

Let us regard the question raised above under two different definitions of 
model checking: 

Model checking is fixpoint iteration without dynamic abstraction and using 
set union to collect states. This coincides with the view that model checking is 
Reachability Analysis. In contrast, abstract interpretation would be considered as 
fixpoint iteration with dynamic abstraction using lattice join to combine abstract 
states. 

This would, however, need to search an exponential state space. Let us regard 
again the subproblem of cache-behavior prediction. In Section 3, it was described 
how abstract cache states compactly represent information about sets of concrete 
cache states. Practice shows that little precision is lost by the abstraction. On
the other hand, for an a-way set-associative cache the number of contents of one
set is large, namely k!/(k-a)! if k is the number of memory blocks mapped to this
set.

Model checking checks whether a given (global) property holds for a finite 
transition system. As explained above, we are interested in verifying as many 
local safety properties at program points as possible. Hence, we either do O(n)
runs of the model checker for a program of size n with a possibly large constant
stemming from the number of potential safety properties, or we code these all
together into one O(n)-size conjunction.



5 WCET Determination by ILP 

Li, Malik, and Wolfe were very successful with an ILP-only approach to WCET 
determination [11,12,13,14] ... at least as far as getting papers about their ap- 
proach published. Cache and pipeline behavior prediction are formulated as a 
single linear program. The i960KB is investigated, a 32-bit microprocessor with 
a 512 byte direct mapped instruction cache and a fairly simple pipeline. Only 
structural hazards need to be modeled, thus keeping the complexity of the inte- 
ger linear program moderate compared to the expected complexity of a model 
for a modern microprocessor. Variable execution times, branch prediction, and 
instruction prefetching are not considered at all. Using this approach for super- 
scalar pipelines does not seem very promising, considering the analysis times 
reported in the article. 

One of the severe problems is the exponential increase of the size of the ILP
in the number of competing l-blocks. l-blocks are maximally long contiguous
sequences of instructions in a basic block mapped to the same cache set. Two
l-blocks mapped to the same cache set compete if they do not have the same
address tag. For a fixed cache architecture, the number of competing l-blocks
grows linearly with the size of the program. Differentiation by contexts,
absolutely necessary to achieve precision, increases this number additionally.
Thus, the size of the ILP is exponential in the size of the program. Even though
the problem is claimed to be a network-flow problem, the size of the ILP is
killing the approach. Growing associativity of the cache increases the number of
competing l-blocks. Thus, increasing cache-architecture complexity also plays
against this approach.

Nonetheless, their method of modeling the control flow as an ILP, the so- 
called Implicit Path Enumeration, is elegant and can be efficient if the size of 
the ILP is kept small. It has been adopted by many groups working in this area. 

The weakness of my argumentation against an ILP-only approach to WCET 
determination is apparent; I argue against one particular way of modeling the 
problem in one ILP. There could be other ways of modeling it. 

6 Conclusion 

The AI + ILP approach has proved successful in industrial practice. Results, at
least on disciplined software but quite complex processors, are precise [24] and
are produced quickly. Benchmark results reported there show an average
overestimation of execution times of roughly 15%. This should be contrasted with
results of an experiment by Alfred Rosskopf at EADS [19]. He measured a
slowdown by a factor of 30 when the caches of a PowerPC 603 were switched off.
The benchmark described in [24] consisted of 12 tasks of roughly 13 kByte of
instructions each. The analysis of these tasks took less than 30 min. per task.

The approach is modular, in that it separates processor behaviour prediction 
and worst-case path determination. Reasons for the efficiency of the approach are 
twofold: AI used for the prediction of processor behaviour uses only one, albeit
complex, fixpoint computation, in which it verifies as many safety properties, i.e.
absence of timing accidents, as it can. The remaining ILP modeling the control 
flow of the program, even when expanded to realize context differentiation, is 
still of moderate size and can be solved quickly. 

As Ken McMillan pointed out, model checking may deliver more precise 
results since it exhaustively checks all possibilities, while abstract interpretation
may lose information due to abstraction. Our practical experience so far does
not show a significant loss of precision. On the other hand, none of the described 
alternatives of using MC seems to offer acceptable performance. It could turn 
out that the exponential worst case does not show up in practice as is the case 
in the experiments on byte-code verification reported in [2]. 

MC could check the conjunction of a set of safety properties in polynomial 
time, after these have been derived by abstract interpretation. 

Only one ILP-only approach could be found in the literature. It had severe 
scaling problems as the size of the ILP grows exponentially with the size of the 
program to be analyzed. 

Acknowledgements. Mooly Sagiv, Jens Palsberg, and Kim Guldstrand Larsen
confronted me with the claim that we could have used MC instead of AI + ILP.
This motivated me to think about why not. Andreas Podelski clarified my view
of things while defending their claim. Alberto Sangiovanni-Vincentelli suggested
that [11,12,13,14] had already solved the problem using ILP, provoking me to
find out why they hadn't. Many thanks go to Reinhold Heckmann, Stephan
Thesing, and Sebastian Winkel for valuable discussions and detailed hints and
to the whole group in the Compiler Design Laboratory and at AbsInt for their
cooperation on this successful line of research and development.


Abstract. Biological systems can be modeled beneficially as reactive
systems, using languages and tools developed for the construction of 
man-made systems. Our long-term aim is to model a full multi-cellular 
animal as a reactive system; specifically, the C. elegans nematode worm, 
which is complex, but very well-defined in terms of anatomy and ge- 
netics. The challenge is to construct a full, true-to-all-known-facts, 4- 
dimensional, fully animated model of the development and behavior of 
this worm (or of a comparable multi-cellular animal), which is multi-level 
and interactive, and is easily extendable - even by biologists - as new 
biological facts are discovered. 

The proposal has three premises: (i) that satisfactory frameworks now 
exist for reactive system modeling and design; (ii) that biological research 
is ready for an extremely significant transition from analysis (reducing 
experimental observations to elementary building blocks) to synthesis 
(integrating the parts into a comprehensive whole), a transition that 
requires mathematics and computation; and (iii) that the true complexity 
of the dynamics of biological systems - specifically multi-cellular living 
organisms - stems from their reactivity. 

In earlier work on T-cell reactivity, we addressed the feasibility of mod- 
eling biological systems as reactive systems, and the results were very 
encouraging [1]. Since then, we have turned to two far more complex 
systems, with the intention of establishing the basis for addressing the 
admittedly extremely ambitious challenge outlined above. One is mod- 
eling T-cell behavior in the thymus [2], using statecharts and Rhapsody, 
and the other is on VPC fate acquisition in the egg-laying system of C. 
elegans [3], for which we used LSCs and the Play-Engine [4]. 

The proposed long-term effort could possibly result in an unprecedented
tool for the research community, both in biology and in computer science. 
We feel that much of the research in systems biology will be going this 
way in the future: grand efforts at using computerized system modeling 
and analysis techniques for understanding complex biology. 



* Full paper in EATCS Bulletin, European Association for Theoretical Computer 
Science, October, 2003. (Early version prepared for the UK Workshop on Grand 
Challenges in Computing Research, November 2002.) 